SOME DISTRIBUTIONS OF SAMPLE MEANS


GEORGE W. BROWN AND JOHN W. TUKEY


RCA Laboratories and Princeton University 


1. Summary. It is shown that certain monomials in normally distributed
quantities have stable distributions with index $2^{-k}$. This provides, for $k \ge 1$,
simple examples where the mean of a sample has a distribution equivalent to
that of a fixed, arbitrarily large multiple of a single observation. These examples
include distributions symmetrical about zero, and positive distributions.

Using these examples, it is shown that any distribution with a very long tail
(of average order $\ge x^{-3/2}$) has the distributions of its sample means grow flatter
and flatter as the sample size increases. Thus the sample mean provides less
information than a single value. Stronger results are proved for still longer
tails.

2. Introduction. This paper derives and exploits certain elementary
expressions for stable distributions. The practicing statistician may be
interested in the general discussion of results, going as far as Section 5. The reader
interested in probability theory may be interested in

(i) the simple monomials in normally distributed quantities which are
shown to be stable (Section 7)
(ii) the resulting bounds on the densities of these stable distributions
(Section 8)
(iii) Theorem A, which forms a partial converse to the Central Limit
Theorem.

It should be pointed out that examples of stable chance quantities arising from
infinite series (Khintchine 1937, [2], [3]) and integrals (Lévy 1935, [4]) are already
known. These results form a natural part of broader investigations into
(i) the relative value of the mean, the median, and their competitors
(ii) the properties and distributions of simple functions of normally
distributed quantities.

3. Stable distributions. One of the typical properties of the normal
distribution with zero mean is that the distribution of the mean of a sample of n
has the same shape but is compressed by the factor $\sqrt{n}$. The Cauchy
distribution is well known for the property that the mean of a sample of n has
the same distribution as a single observation.

Statisticians have not widely appreciated the fact that there are symmetric,
smooth distributions for every positive $\lambda < 2$, with the property that the
distribution of the mean of a sample of n has the same shape as the original
distribution but is spread out in the ratio $n^{(1/\lambda)-1}$. These are the symmetric stable
distributions of index $\lambda$.

It is interesting to note that if $\lambda = .001$, then the mean of a sample of two
is $2^{999}$ times as variable as the mean of a sample of one. For small $\lambda$ the means
become unduly variable with a rapidity which is difficult to comprehend.
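
The spreading ratio quoted above is easy to tabulate. The following is a minimal sketch, not part of the original paper; the function name `mean_spread_ratio` is a hypothetical helper, and it simply evaluates $n^{(1/\lambda)-1}$ for a symmetric stable law of index $\lambda$.

```python
import math


def mean_spread_ratio(n, lam):
    """Scale of the mean of n observations relative to a single observation,
    n**(1/lam - 1), for a symmetric stable law of index lam (0 < lam <= 2)."""
    if not 0.0 < lam <= 2.0:
        raise ValueError("index must satisfy 0 < lam <= 2")
    return float(n) ** (1.0 / lam - 1.0)


if __name__ == "__main__":
    # lam = 2 (normal), 1 (Cauchy), 1/2, and the extreme case lam = .001
    for lam in (2.0, 1.0, 0.5, 0.001):
        print(lam, mean_spread_ratio(2, lam))
```

For $\lambda = 2$ the ratio is below one (the mean is compressed), for $\lambda = 1$ it equals one, and for smaller $\lambda$ it exceeds one and grows very quickly.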
4. Outline of results. Section 7 is devoted to the proof that certain
monomials in normal variables are stable of index $2^{-k}$ for integral k. Both
symmetrical and positive cases are shown to exist. For k = 0, the symmetrical case is
the familiar Cauchy distribution, which is the distribution of Student's "t"
on one degree of freedom, while the positive case for k = 1 is the distribution
of Snedecor's "F" on $\infty$ and 1 degrees of freedom.
In Section 8 it is shown that the symmetrical stable distribution of index $\lambda$
has a density which is
(i) bounded by a constant
(ii) bounded by a constant times $|x|^{-(1+\lambda)}$, for the values $\lambda = 1, \tfrac12, \tfrac14, \tfrac18,
\cdots$, for which elementary examples are available. It is conjectured that
this is true for all $\lambda < 2$.
In Section 9 it is shown that, if a distribution has one long tail in the sense that
$$\lim\, |x|^{1+\lambda} \Pr\{x < X < x + h\} > 0, \tag{1.1}$$
for some h and one of the above values of $\lambda$ (the lim may be taken either as
$x \to +\infty$ or as $x \to -\infty$), then the distribution of the sum of a sample of n
spreads out as fast as for a stable distribution with the same value of $\lambda$. This
may be restated for the mean as follows:
(i) A distribution has a long tail of order $|x|^{-(1+\lambda)}$ if (1.1) holds for some
h > 0 and choice of sign for x.
(ii) If the distribution has a density f(x), then (1.1) is a consequence of
$$f(x) \ge \frac{A}{|x|^{1+\lambda}}, \qquad A > 0. \tag{1.2}$$


(iii) The distribution of the mean of a sample of n will be said to spread out
as fast as $n^{\epsilon}$, if the distance between any two percentage points for the mean of a
sample of n is ultimately larger than a fixed multiple of $n^{\epsilon}$.

(iv) THEOREM A. If the distribution of X has at least one long tail of order
$|x|^{-(1+\lambda)}$, where $\lambda = 1, \tfrac12, \tfrac14, \cdots$, then the distribution of the mean of a sample
of n values of X spreads out as fast as $n^{(1/\lambda)-1}$.
Section 10 presents a simple example of a distribution symmetric about zero 

with such long tails that 

(i) the distribution of the sample mean spreads out faster than any power 
of n, 

(ii) the median of a sample of any size fails to have finite moments of 
positive order, integral or fractional. 

5. Consequences for applied statistics. The basic consequences of these 
results for applied statistics can be summarized in the following statements. 

(a) The positions that the Cauchy distribution is an isolated case, or else 
an extreme example of pathology, are now untenable. 
(b) The use of the mean of a sample as a measure of location (or, when
dealing with positive distributions fixed at zero, as a measure of scale)
implies a belief that the tails of the underlying distribution are not too long.

(c) It is probable that the relative efficiencies of mean and median are
greatly affected by the length of the tail.

The importance of this last statement lies in the fact that direct empirical 
evidence about tail length is very hard to obtain. The mean is well known 
to be more efficient when the underlying distribution is normal. Normality of 
the tails of practical distributions is rarely based on firm empirical evidence. 
In these practical cases, greater efficiency of the mean should often not be 
assumed without empirical confirmation. 

It may be argued that the results of this paper apply to the limit as $n \to \infty$
and to the behavior of the distribution near infinity, while the practical problems
involve moderate values of n and the behavior of the distribution near its 5%,
1%, 0.1%, 95%, 99%, and 99.9% points. This is undoubtedly true, but the
authors believe, and have some evidence to confirm, the following
correspondence principle:

If certain mathematical tails imply certain asymptotic behavior, then 
similar practical tails imply similar behavior in moderate samples. 

Here "mathematical tails" refers to behavior at infinity while "practical tails"
run from the 5% to the 0.1% point and from the 95% to the 99.9% point.

It is of some interest to point out that Snedecor's "F" provides applications
of Theorem A. If N values of F are averaged, where each was obtained on $n_1$
and $n_2$ degrees of freedom, then as N increases

(i) if $n_2 > 2$, the average converges to 1 (i.e. all percent points converge
to 1), by the Central Limit Theorem

(ii) if $n_2 = 2$, the percent points of the average stay a finite distance away
from each other, by Theorem A

(iii) if $n_2 = 1$, the percent points of the average separate from each
other at least as fast as a constant times $\sqrt{N}$, by Theorem A.

The consequences of Theorem A follow from the asymptotic density of F,
which is a constant times $F^{-(1 + n_2/2)}$.

6. Notation and terminology. Chance quantities (random variables)
will be denoted by capitals and their values by lower case letters. The same
letter will generally be used, so that x will frequently be a value of X.

The letter S, with or without indices, represents a standard deviate
(normally distributed quantity with zero mean and unit variance). Unless
otherwise specified all sets of chance quantities will be assumed to be independent.

Cumulative distribution functions will be referred to simply as "cumulatives"
and will be denoted by capitals. Probability density functions will be referred
to as "densities" and will be denoted by the corresponding lower case letters.

The convolution of two cumulatives F and G will be denoted by F*G. It is 
the cumulative of sums of two independent values, one from each distribution. 
7. Special stable distributions. Cauchy (1853, [1]) recognized that
distributions with characteristic functions of the form
$$e^{-a|u|^{\lambda}}$$
were stable. A distribution is stable if whenever k and l are positive and A
and B are independent chance quantities distributed according to the same
law, then kA + lB is distributed like a fixed multiple of A. It is known (Lévy
1937, [5], pp. 94 ff.) that any stable distribution has a characteristic function
of the form
$$e^{-(a + i\beta\,\mathrm{sgn}\,u)|u|^{\lambda}},$$
where $0 < \lambda \le 2$, $a > 0$, and $|\beta| \le |a \tan \tfrac12 \pi\lambda|$. Each stable distribution
thus has an index $\lambda$ such that $kA + lB$ and $(k^{\lambda} + l^{\lambda})^{1/\lambda} A$ have the same
distribution when A and B are a sample of two from the given distribution.

This section exhibits, for every integral k, simple monomials of standard
deviates which have stable distributions of index $2^{-k}$.


(7.1) THEOREM: Let $S, S_0, S_1, S_2, \cdots$ be a sequence of independent standard
deviates. Then
(i) $C_0 = S/S_0$ and $P_0 = 1$
are stable of index $1 = 2^{-0}$.
(ii) $C_1 = S/S_0 S_1^2 = C_0/S_1^2$ and $P_1 = 1/S_1^2 = P_0/S_1^2$
are stable of index $\tfrac12 = 2^{-1}$.
(iii) $C_2 = S/S_0 S_1^2 S_2^4 = C_1/S_2^4$ and $P_2 = 1/S_1^2 S_2^4 = P_1/S_2^4$
are stable of index $\tfrac14 = 2^{-2}$.
(iv) in general, $C_k = C_{k-1}/S_k^{2^k}$ and $P_k = P_{k-1}/S_k^{2^k}$
are stable of index $2^{-k}$.

The $C_k$ are a sequence of symmetrically distributed chance quantities which
are here presented as monomials in normally distributed chance quantities and
whose stability properties imply for $k \ge 1$ that the distributions of means of
samples spread out as the sample size increases. The $P_k$ are a similar sequence,
all of whose values are positive.
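
The stability of the simplest positive case can be checked by simulation. The following is a minimal sketch, not part of the original paper; the function names, the sample size and the seed are illustrative assumptions. Since $P_1 = 1/S_1^2$ is stated to be stable of index $\tfrac12$, the mean of two independent values should be distributed like $2^{(1/\lambda)-1} = 2$ times a single value, so the ratio of the two sample medians should be close to 2.

```python
import random
import statistics


def draw_p1(rng):
    """Return one value of P_1 = 1 / S_1**2 for a standard normal S_1."""
    sq = 0.0
    while sq == 0.0:  # guard against a (practically impossible) zero draw
        s = rng.gauss(0.0, 1.0)
        sq = s * s
    return 1.0 / sq


def median_ratio(n_draws=200_000, seed=0):
    """Median of means of two P_1 values divided by the median of single values."""
    rng = random.Random(seed)
    singles = [draw_p1(rng) for _ in range(n_draws)]
    pair_means = [0.5 * (draw_p1(rng) + draw_p1(rng)) for _ in range(n_draws)]
    return statistics.median(pair_means) / statistics.median(singles)


if __name__ == "__main__":
    print(median_ratio())  # expected to be near 2, up to sampling error
```

Medians are used rather than means or variances because $P_1$ has no finite mean, so moment-based summaries would not settle down.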

The stability properties of the $C_k$ follow, directly, by means of elementary
composition properties of characteristic functions, from
(7.2) LEMMA: The characteristic function of $C_k$ is
$$E(e^{itC_k}) = \exp(-2\,|\tfrac12 t|^{2^{-k}}).$$
PROOF: The case k = 0 is the familiar Cauchy distribution. Denoting the
normal cumulative by N(s), it is seen that
$$E(e^{itC_0}) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} e^{its/s_0}\, dN(s)\, dN(s_0)
= \int_{-\infty}^{\infty} \exp(-\tfrac12 t^2/s_0^2)\, dN(s_0)
= e^{-|t|}.$$
The second definite integral is well known (e.g. Formula 495 in B. O. Pierce's
table). Assuming the result for k−1, write
$$E(e^{itC_k}) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \exp(itC_{k-1}/s_k^{2^k})\, dF_{k-1}(C_{k-1})\, dN(s_k)
= \int_{-\infty}^{\infty} \exp(-2\,|\tfrac12 t|^{2^{1-k}}/s_k^2)\, dN(s_k)
= \exp(-2\,|\tfrac12 t|^{2^{-k}}),$$
precisely as in the derivation for k = 0.
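
The definite integral used at each step of the induction is of the form $E[\exp(-c/S^2)] = \exp(-\sqrt{2c})$ for a standard deviate S and $c \ge 0$ (with $c = \tfrac12 t^2$ this gives $e^{-|t|}$). The following is a minimal numerical sketch of that identity, not part of the original paper; the function name, the truncation point and the number of steps are illustrative assumptions.

```python
import math


def expected_exp_neg_c_over_s2(c, upper=12.0, steps=200_000):
    """Midpoint-rule approximation of E[exp(-c / S**2)] for a standard normal S.

    The integrand is even in s, so the integral over (0, upper) is doubled;
    the normal tail beyond `upper` is negligible.
    """
    h = upper / steps
    total = 0.0
    for i in range(steps):
        s = (i + 0.5) * h
        total += math.exp(-0.5 * s * s - c / (s * s))
    return 2.0 * total * h / math.sqrt(2.0 * math.pi)


if __name__ == "__main__":
    for c in (0.1, 0.5, 2.0):
        print(c, expected_exp_neg_c_over_s2(c), math.exp(-math.sqrt(2.0 * c)))
```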

The stability properties of the $P_k$ follow, by completely analogous use of
the moment generating function, from
(7.3) LEMMA: The moment generating function of $P_k$ is
$$E(e^{-tP_k}) = \exp(-2(\tfrac12 t)^{2^{-k}}), \qquad t > 0.$$
PROOF: The trivial case k = 0 is verified directly, since $P_0 = 1$. The induction
from k−1 to k is identical with the derivation of (7.2), as is seen by writing
$$E(e^{-tP_k}) = \int\!\!\int \exp(-tP_{k-1}/s_k^{2^k})\, dG_{k-1}(P_{k-1})\, dN(s_k)
= \int \exp(-2(\tfrac12 t)^{2^{1-k}}/s_k^2)\, dN(s_k)
= \exp(-2(\tfrac12 t)^{2^{-k}}).$$

In order to verify the stability properties, consider distributions with
characteristic functions of the form $\exp(-d|t|^{\lambda})$. If A and B are independently
distributed according to this distribution, then
$$E(e^{it(lA+mB)}) = E(e^{itlA})\,E(e^{itmB}) = e^{-d(l^{\lambda}+m^{\lambda})|t|^{\lambda}}$$
for l, m > 0. Parallel application of the moment generating function yields
precisely analogous results.
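
As a worked instance of this relation (a sketch in the notation of this section; the choice of equal weights is made here purely for illustration), the scale factor for the mean of two, and of n, observations follows directly:

```latex
\begin{align*}
E\bigl(e^{it(lA+mB)}\bigr) &= e^{-d(l^{\lambda}+m^{\lambda})|t|^{\lambda}}
   = E\bigl(e^{itcA}\bigr), \qquad c = (l^{\lambda}+m^{\lambda})^{1/\lambda}, \\
l = m = \tfrac12: \quad c &= \bigl(2 \cdot 2^{-\lambda}\bigr)^{1/\lambda} = 2^{(1/\lambda)-1}, \\
n \text{ equal weights } 1/n: \quad c &= \bigl(n \cdot n^{-\lambda}\bigr)^{1/\lambda} = n^{(1/\lambda)-1}.
\end{align*}
```

For $\lambda = 2$ this is the familiar compression by $\sqrt{n}$, for $\lambda = 1$ the factor is unity, and for $\lambda = \tfrac12$ the mean of n is distributed like n times a single observation.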

8. Some auxiliary results. It is the purpose of this section to establish 
some results concerning stable distributions. It will be convenient to state 
and prove some of these lemmas in general form. 

(8.1) LEMMA: If X has a density f(x) satisfying
$$f(x) \le A|x|^{-a},$$
then X has finite negative moments of orders down to −(1−a).
PROOF: If $-(1-a) < \beta < 0$, then
$$|x|^{\beta} f(x) \le A|x|^{-a+\beta},$$
with $-a+\beta > -1$. Now
$$\int_{-\infty}^{\infty} |x|^{\beta} f(x)\,dx \le \int_{-\infty}^{-1} f(x)\,dx + \int_{-1}^{1} |x|^{\beta} f(x)\,dx + \int_{1}^{\infty} f(x)\,dx
\le \int_{-\infty}^{\infty} f(x)\,dx + \int_{-1}^{1} A|x|^{-a+\beta}\,dx < \infty,$$
which proves the lemma.
(8.2) LEMMA: If X has a density f(x) satisfying
$$f(x) \le A|x|^{-a}$$
and if Y has a density g(y) and a finite negative moment of order −(1−a), then
the density h(x) of XY satisfies
$$h(x) \le A_1|x|^{-a}.$$
PROOF: The density h(x) satisfies
$$h(x) = \int_{-\infty}^{\infty} \{f(x/t)\,g(t)/|t|\}\,dt
\le \int_{-\infty}^{\infty} A|t|^{a}|x|^{-a}\, g(t)\,|t|^{-1}\,dt
= \Bigl\{\int_{-\infty}^{\infty} A|t|^{-(1-a)} g(t)\,dt\Bigr\}\,|x|^{-a} = A_1|x|^{-a}.$$
(8.3) LEMMA: The density $h_k(y)$ of
$$Y_k = S_0 (S_1)^2 (S_2)^4 \cdots (S_k)^{2^k},$$
where $S_0, S_1, S_2, \cdots, S_k$ are independent standard deviates, satisfies
$$h_k(y) \le A|y|^{-(1-2^{-k})},$$
and hence $Y_k$ has finite negative moments of all orders down to $-2^{-k}$.
PROOF: Let $g_k(x)$ be the density of
$$X_k = (S_k)^{2^k};$$
then
$$g_k(x) = (2\pi)^{-1/2}\, 2^{1-k} \exp(-\tfrac12 x^{2^{1-k}})\, x^{-1+2^{-k}},$$
whence
$$g_k(x) \le A_1|x|^{-(1-2^{-k})}.$$
For k = 0 this is the desired result; the other cases follow by induction, using
$Y_k = X_k Y_{k-1}$ and lemma (8.2). The final statement of the lemma then follows
from lemma (8.1).
(8.4) THEOREM: For $\lambda = 2^{-k}$, the density $m_k(x)$ of $C_k$ satisfies
$$m_k(x) \le A|x|^{-(1+2^{-k})} = A|x|^{-(1+\lambda)} \tag{*}$$
and also
$$m_k(x) \le A_2. \tag{**}$$
PROOF: By definition, $C_k = S/Y_k$. By lemma (8.3) the density of $Y_k$ satisfies
$$h_k(y) \le A_1|y|^{-(1-2^{-k})}.$$
The density of $1/Y_k$ satisfies
$$h(z) = \frac{1}{z^2}\, h_k(1/z) \le \frac{1}{z^2}\, A_1|z|^{1-2^{-k}} = A_1|z|^{-(1+2^{-k})}.$$
Since S has a finite moment of order $2^{-k}$, it follows from lemma (8.2) that the
density of $S/Y_k$ satisfies the desired relation (*). Since S has finite moments
of all positive orders, so does $S^{2^k}$ and therefore $Y_k$. Thus $1/Y_k$ has moments
of all negative orders, including −1. Since the density of S is bounded, lemma
(8.2) implies the same for $S/Y_k$, and hence for $C_k$. This completes the proof
of the theorem.
9. Distributions with a long tail. The purpose of this section is to prove
(9.1) THEOREM: If D has a cumulative F(x) such that for some h > 0, either
$$\lim_{x\to+\infty} \frac{F(x+h) - F(x)}{|x|^{-(1+\lambda)}} > 0, \quad\text{or}\quad \lim_{x\to-\infty} \frac{F(x+h) - F(x)}{|x|^{-(1+\lambda)}} > 0,$$
where $\lambda = 2^{-k}$ for $k = 0, 1, 2, \cdots$, and if $K_n(\alpha)$ is the $\alpha$-point ($100\alpha$ percent point)
of the distribution of sums of n independent values of D, then
$$\lim_{n\to\infty} \frac{K_n(\alpha_1) - K_n(\alpha_2)}{n^{1/\lambda}} > 0,$$
whenever $\alpha_1 > \alpha_2$.
We begin with some lemmas. 
(9.2) LEMMA: If
$$F(x) = \beta F'(x) + (1-\beta)F''(x),$$
$$G(x) = \beta F'(x) + (1-\beta)1(x), \qquad 0 \le \beta \le 1,$$
where $F'(x)$ is a cumulative symmetric about zero and unimodal, $F''(x)$ is a
cumulative symmetric about zero, and 1(x) is the cumulative concentrated at zero (whence
F(x) and G(x) are cumulatives), and if $F_n(x)$ and $G_n(x)$ are the cumulatives of
sums of samples of n from F(x) and G(x) respectively, then
$$F_n(x) \le G_n(x), \quad x > 0,$$
$$F_n(x) \ge G_n(x), \quad x < 0.$$
PROOF: We begin with the case n = 2, where
$$F_2 = \beta^2 F'*F' + 2\beta(1-\beta)F'*F'' + (1-\beta)^2 F''*F''$$
and
$$G_2 = \beta^2 F'*F' + 2\beta(1-\beta)F' + (1-\beta)^2\, 1.$$
The lemma will have been proved for n = 2 if we can show that
$$F'*F''(x) \le F'(x), \quad x > 0,$$
$$F'*F''(x) \ge F'(x), \quad x < 0.$$
Now, if x > 0,
$$F'*F''(x) = \int_{-\infty}^{\infty} F'(x-s)\,dF''(s)
= \int_{0}^{\infty} \{F'(x-s) + F'(x+s)\}\,dF''(s)
\le 2\int_{0}^{\infty} F'(x)\,dF''(s) = F'(x),$$
where the first equality follows from the symmetry of $F'$, the inequality follows
from the unimodality of $F'$, and the last equality follows from the symmetry
of $F''$. The inequality is reversed if x < 0.

For general n,
$$F_n = \sum_k \binom{n}{k} \beta^k (1-\beta)^{n-k} F'_k * F''_{n-k},$$
$$G_n = \sum_k \binom{n}{k} \beta^k (1-\beta)^{n-k} F'_k,$$
where $F'_k$ (the convolution of k copies of $F'$) is the cumulative for sums of k
independent values from $F'$, and $F''_k$ is similarly related to $F''$. Since $F'_k$ is
unimodal and symmetric and since $F''_{n-k}$ is symmetric, the same argument can
be applied term by term to complete the proof of the lemma. The requirement
that $F''$ be symmetric could be replaced by the formally weaker condition that
$F''_k(0) = \tfrac12$ for all k.

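(9.3) LEMMA: If
$$F(x) = \beta F_{(\lambda)}(x) + (1-\beta)1(x), \qquad 0 \le \beta \le 1,$$
where $F_{(\lambda)}(x)$ is the cumulative of $C_k$, with $\lambda = 2^{-k}$, and if $K_n(\alpha)$ is as defined in
(9.1), then
$$\lim_{n\to\infty} n^{-1/\lambda} K_n(\alpha) = \beta^{1/\lambda} K_{(\lambda)}(\alpha),$$
where $K_{(\lambda)}(\alpha)$ is the $\alpha$-point for $F_{(\lambda)}(x)$.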
PROOF: Let $F_n$ and $F_{(\lambda),n}$ be the cumulatives of sums of n from F and $F_{(\lambda)}$
respectively, whence
$$F_{(\lambda),n}(x) = F_{(\lambda)}(n^{-1/\lambda} x).$$
Then
$$F_n(x) = \sum_k \binom{n}{k} \beta^k (1-\beta)^{n-k} F_{(\lambda),k}(x)
= \sum_k \binom{n}{k} \beta^k (1-\beta)^{n-k} F_{(\lambda)}(k^{-1/\lambda} x).$$
The characteristic function of $(n\beta)^{-1/\lambda} X$ is
$$\varphi_n(t) = \sum_k \binom{n}{k} \beta^k (1-\beta)^{n-k} \exp\bigl(-dk\,(n\beta)^{-1} |t|^{\lambda}\bigr)
= \sum_k \binom{n}{k} \beta^k (1-\beta)^{n-k} \exp\Bigl(-d\,\frac{k}{n\beta}\,|t|^{\lambda}\Bigr),$$
where the characteristic function associated with $F_{(\lambda)}(x)$ is $\exp(-d|t|^{\lambda})$. Thus
we have to deal with
$$E\Bigl\{\exp\Bigl(-d\,\frac{k}{n\beta}\,|t|^{\lambda}\Bigr)\Bigr\}$$
where k has a binomial distribution with mean $n\beta$ and variance $n\beta(1-\beta)$,
so that $k/n\beta$ converges stochastically to unity. This implies that
$$\lim_{n\to\infty} \varphi_n(t) = e^{-d|t|^{\lambda}}$$
uniformly in every finite interval, whence $(n\beta)^{-1/\lambda} X$ converges stochastically
to $C_k$, which completes the proof of the lemma.


(9.4) LEMMA: If the symmetric cumulative F(x) has a density f(x), and if constants
$c_1$ and $c_2$ exist such that
$$f(x) \ge \min(c_1,\ c_2|x|^{-(1+\lambda)}),$$
where $\lambda = 1, \tfrac12, \tfrac14, \cdots$, then, if $\alpha \ne \tfrac12$,
$$\lim_{n\to\infty} |n^{-1/\lambda} K_n(\alpha)| > 0.$$
PROOF: According to theorem (8.4) there are constants $d_1$ and $d_2$ such that the
density of $C_k$ is bounded by $\min(d_1,\ d_2|x|^{-(1+\lambda)})$. Hence
$$\frac{F(x) - \beta F_{(\lambda)}(x)}{1-\beta}$$
is monotone when $\beta = \min(c_1/d_1,\ c_2/d_2)$, and hence is a distribution function.
By lemma (9.2) the $\alpha$-points of $F_n$ lie outside those for $\beta F_{(\lambda)}(x) + (1-\beta)1(x)$,
and these, by lemma (9.3), increase at least as fast as $A n^{1/\lambda}$.
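(9.5) LEMMA: If the density of D exists and equals f(x), and if either
$$\lim_{x\to+\infty} |x|^{1+\lambda} f(x) > 0,$$
or
$$\lim_{x\to-\infty} |x|^{1+\lambda} f(x) > 0,$$
where $\lambda = 1, \tfrac12, \tfrac14, \tfrac18, \cdots$, then, for $\alpha_1 > \alpha_2$,
$$\lim_{n\to\infty} n^{-1/\lambda} \{K_n(\alpha_1) - K_n(\alpha_2)\} > 0.$$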
PROOF: Let $D_1$ and $D_2$ be independent with the distribution of D. Then
$D_1 - D_2$ has a symmetric density given by
$$g(x) = \int_{-\infty}^{\infty} f(x+s) f(s)\,ds.$$
If
$$\lim_{x\to+\infty} |x|^{1+\lambda} f(x) > 0,$$
then for suitable h and $\epsilon > 0$,
$$f(x) \ge \epsilon|x|^{-(1+\lambda)}, \quad\text{for all } x \ge h.$$
Therefore, for x > 0, writing $\gamma = -(1+\lambda)$,
$$g(x) \ge \int_{h}^{h+1} f(x+s) f(s)\,ds \ge \epsilon^2 |h+1+x|^{\gamma}\,|h+1|^{\gamma} = b_1|b_2 + x|^{\gamma}.$$
Now
$$b_1|b_2 + x|^{\gamma} \ge \min\{b_1 2^{\gamma} b_2^{\gamma},\ b_1 2^{\gamma}|x|^{\gamma}\}$$
and hence, for x > 0 and suitable $c_1 > 0$, $c_2 > 0$,
$$g(x) \ge \min\{c_1,\ c_2|x|^{\gamma}\}.$$
Since g(x) is symmetric, this is also true for x < 0. If
$$\lim_{x\to-\infty} |x|^{1+\lambda} f(x) > 0,$$
then a similar argument proves the same result.

Let $K'_n(\alpha)$ be the $\alpha$-point for the sum of n values of $D_1 - D_2$, and $K_n(\alpha)$ be
the $\alpha$-point for the sum of n values of D. The most elementary relation
between these functions is
$$|K'_n(\tfrac12 + \tfrac12(\alpha_1 - \alpha_2)^2)| \le |K_n(\alpha_1) - K_n(\alpha_2)|.$$
To see this, observe that the sum of a sample of n values of $D_1 - D_2$ is the
difference of the sums of two independent samples of n values of D, and that
there is a probability of $(\alpha_1 - \alpha_2)^2$ that both of these sums will fall between
$K_n(\alpha_1)$ and $K_n(\alpha_2)$. Thus the intervals $(-|K_n(\alpha_1) - K_n(\alpha_2)|,\ 0)$ and
$(0,\ |K_n(\alpha_1) - K_n(\alpha_2)|)$ are each occupied by the difference with probability
$\ge \tfrac12(\alpha_1 - \alpha_2)^2$. Since $K'_n(\tfrac12) = 0$, the relation follows. Hence, if $\alpha_1 > \alpha_2$,
$$\lim_{n\to\infty} n^{-1/\lambda}\{K_n(\alpha_1) - K_n(\alpha_2)\} \ge \lim_{n\to\infty} n^{-1/\lambda} K'_n(\tfrac12 + \tfrac12(\alpha_1 - \alpha_2)^2)$$
and by lemma (9.4) applied to the distribution of $D_1 - D_2$ this latter lim is
positive, which completes the proof of the lemma.
With the ground prepared, it is now possible to complete the
PROOF OF THE THEOREM: Let h be chosen so that
$$\lim_{x\to+\infty} |x|^{1+\lambda}(F(x+h) - F(x)) > 0.$$
This can always be done, if X is replaced by −X when necessary. Let U have
the uniform distribution on the interval (0, 1) and consider the variable D + hU.
This variable has a density given by
$$g(x) = \frac{F(x+h) - F(x)}{h}$$
and, therefore,
$$\lim_{x\to+\infty} |x|^{1+\lambda} g(x) > 0.$$
Let $K_n(\alpha)$ be the $\alpha$-point for the sum of a sample of n values of D, and let $K^*_n(\alpha)$
be the $\alpha$-point for the sum of a sample of n values of D + hU. Since $|hU| \le h$,
it follows that
$$|K_n(\alpha) - K^*_n(\alpha)| \le nh.$$
Therefore, if $1/\lambda > 1$ and $\alpha_1 > \alpha_2$,
$$\lim_{n\to\infty} n^{-1/\lambda}\{K_n(\alpha_1) - K_n(\alpha_2)\} = \lim_{n\to\infty} n^{-1/\lambda}\{K^*_n(\alpha_1) - K^*_n(\alpha_2)\},$$
and by lemma (9.5) the latter lim is positive.

The case of $\lambda = 1$ requires a slightly more delicate argument. The sum of
a sample of n values of hU is asymptotically normally distributed, and hence 
it is less than Agn’, for a suitable As, with probability 6. Therefore 


K,(a3) < K*(oeB) < K,(a) + Aan’ 


and the same process yields the desired conclusion. 


10. A distribution with very long tails. A somewhat pathological example
is provided by the symmetric cumulative
$$F(x) = \frac{1}{\ln(e^2 + |x|)}, \qquad x \le 0,$$
$$F(x) = 1 - \frac{1}{\ln(e^2 + x)}, \qquad x \ge 0,$$
which has the density
$$f(x) = \frac{1}{(e^2 + |x|)\{\ln(e^2 + |x|)\}^2}.$$
Since
$$\lim_{x\to\infty} |x|^{1+\lambda} f(x) = \infty \quad\text{for all } \lambda > 0,$$
it follows from theorem (9.1) that the distribution of the sum of a sample of
n values of X spreads out faster than any power of n. The same must therefore
be true of the mean of a sample of n. There is clearly no use in taking any
kind of mean of such a sample.
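
The length of these tails is easiest to appreciate through the percent points of a single observation. The following is a minimal sketch, not part of the original paper; it assumes the cumulative $F(x) = 1/\ln(e^2 + |x|)$ for $x \le 0$ and $F(x) = 1 - 1/\ln(e^2 + x)$ for $x \ge 0$, and the function names `cdf` and `quantile` are hypothetical helpers obtained by inverting that formula.

```python
import math

E2 = math.exp(2.0)  # the constant e**2 appearing in the cumulative


def cdf(x):
    """Cumulative F(x) of the long-tailed symmetric example."""
    if x <= 0.0:
        return 1.0 / math.log(E2 + abs(x))
    return 1.0 - 1.0 / math.log(E2 + x)


def quantile(alpha):
    """The alpha-point of F, found by solving F(x) = alpha in closed form."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    if alpha <= 0.5:
        return -(math.exp(1.0 / alpha) - E2)
    return math.exp(1.0 / (1.0 - alpha)) - E2


if __name__ == "__main__":
    for alpha in (0.25, 0.05, 0.01):
        print(alpha, quantile(alpha))
```

The lower quartile is of moderate size, but the 1% point is already of the order of $e^{100}$ in absolute value.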

There will, of course, be something to gain by taking the median of a sample
of 2n + 1, since the distribution of the median always shrinks together as
$n \to \infty$, and whenever, as is true here, the density is finite and continuous
at the population median, the distributions of the sample medians shrink toward
the population median.

This does not prevent some pathology, however, since the cumulative for
the median of 2n + 1 takes the form
$$\frac{(2n+1)!}{(n!)^2(n+1)}\,\{F(x)\}^{n+1}\{1 + P(F(x))\},$$
where P(t) is a polynomial of degree n with no constant term. Thus, for large
negative values of x, the cumulative for the median is asymptotically
$$\frac{(2n+1)!}{(n!)^2(n+1)} \cdot \frac{1}{\{\ln(e^2 + |x|)\}^{n+1}}$$
and the corresponding density is asymptotically
$$\frac{(2n+1)!\,(n+1)}{(n!)^2(n+1)\{\ln(e^2 + |x|)\}^{n+2}(e^2 + |x|)}$$
and it follows that the median has no moments of any positive order, integral
or fractional. This is true no matter how large the sample used!
UNBIASED ESTIMATES FOR CERTAIN BINOMIAL SAMPLING
PROBLEMS WITH APPLICATIONS¹


By M. A. GIRSHICK, FREDERICK MOSTELLER, AND L. J. SAVAGE


U. S. Department of Agriculture; Statistical Research Group, Princeton
University; and Statistical Research Group, Columbia University


1. Introduction. The purpose of this paper is to present some theorems with
applications concerning unbiased estimation of the parameter p (fraction
defective) for samples drawn from a binomial distribution. The estimate
constructed is applicable to samples whose items are drawn and classified one at a
time until the number of defectives i, and the number of nondefectives j,
simultaneously agree with one of a set of preassigned number pairs. When this
agreement takes place, the sampling operation ceases and an unbiased estimate
of the proportion p of defectives in the population may be made. Some examples
of this kind of sampling are ordinary single sampling in which n items are
observed and classified as defective or nondefective; curtailed single sampling where
it is desired to cease sampling as soon as the decision regarding the lot being
inspected can be made, that is as soon as the number of defectives or nondefectives
attain one of a fixed pair of preassigned values; double, multiple, and sequential
sampling. In the cases of double and multiple sampling the subsamples may
be curtailed when a decision is reached, while for sequential sampling the
process may be truncated, i.e. an upper bound may be set on the amount of sampling
to be done. In section 3 expressions are given for the unique unbiased
estimates of p for single, curtailed single, curtailed double, and sequential sampling.

One or two of the illustrative examples of section 3 may be of interest because 
their rather bizarre results suggest that some estimate other than an unbiased 
estimate may be preferable; but the discussion of estimates other than unbiased 
ones is outside the scope of this paper. 


2. The estimate $\hat{p}$. For the purposes of the present paper the word point will
refer only to points in the xy-plane with nonnegative integral coordinates.

We shall need the following nomenclature. A region R is a set of points
containing (0, 0). The point $(x_2, y_2)$ is immediately beyond $(x_1, y_1)$ if either $x_2 =
x_1 + 1$, $y_2 = y_1$ or $x_2 = x_1$, $y_2 = y_1 + 1$. A path in R from the point $\alpha_0$ to the
point $\alpha_n$ is a finite sequence of points $\alpha_0, \alpha_1, \cdots, \alpha_n$ such that $\alpha_i$ ($i > 0$) is
immediately beyond $\alpha_{i-1}$, and $\alpha_i \in R$ with the possible exception of $\alpha_n$. A
boundary point, that is, an element of the boundary B of R, is a point not in R
which is the last point $\alpha_n$ of a path from the origin. Accessible points are the
points in R which can be reached by paths from the origin, while inaccessible
points are the points which cannot be reached by any path from the origin.


1 This paper was originally written by Mosteller and Savage. A communication from
M. A. Girshick revealed that he had independently discovered for the sequential probability
ratio test the estimate $\hat{p}(\alpha)$ given here and demonstrated its uniqueness. For purposes of
publication it seemed appropriate to present the results in a single paper.
All points are thus divided into three mutually exclusive categories: accessible,
inaccessible, and boundary points. The index of a point is the sum of its
coordinates, and the index of a region is the least upper bound of the indices of its
accessible points. A finite region is a region for which the indices of the
accessible points are less than some number n. In particular a region containing
only a finite number of points is finite.

Paths may be thought of as arising by a random process such that a path
reaching $\alpha_i = (x, y)$, $\alpha_i \in R$, will be extended to $\alpha_{i+1} = (x, y + 1)$ with probability
p or to $\alpha_{i+1} = (x + 1, y)$ with probability q = 1 − p. We exclude p = 0, 1
unless these values are specifically mentioned. When a path is extended to a
boundary point of R the process ceases. It is clear from the definitions that for
a finite region R, paths from the origin cannot include more points than n + 2
where n is the index of the region. This means that a path from the origin
cannot escape from a finite region and that the probability that it strikes some
boundary point is unity. It is clear that each path from the origin to a boundary
point or an accessible point has probability $p^y q^x$, if the point has coordinates
(x, y). We will need the following statements which are immediate consequences
of the discussion above:

A. The probability of a boundary point or an accessible point being included in a
path from the origin is $P(\alpha) = k(\alpha) p^y q^x$, where $k(\alpha)$ is the number of paths from the
origin to the point. We shall call $P(\alpha)$ the probability of the point.

B. For a finite region $\sum_{\alpha \in B} P(\alpha) = 1$, i.e. the sum of the probabilities of the
boundary points is unity.
Any region for which $\sum_{\alpha \in B} P(\alpha) = 1$ will be called a closed region.
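
Statements A and B can be checked mechanically for a small finite region. The following is a minimal sketch, not taken from the paper; the region is supplied as a hypothetical predicate `in_region(x, y)`, the function names are illustrative, and the example region (sampling stops at 3 nondefectives or 2 defectives) is chosen here only for illustration. Path counts $k(\alpha)$ are accumulated index by index, and exact fractions are used so that the sum of the boundary probabilities can be compared with 1 exactly.

```python
from fractions import Fraction


def boundary_path_counts(in_region, max_steps=1000):
    """Return {boundary point: k(point)}, the number of paths from the origin.

    in_region(x, y) -> bool describes the region R and must hold at (0, 0).
    """
    if not in_region(0, 0):
        raise ValueError("a region must contain the origin")
    frontier = {(0, 0): 1}  # accessible points of the current index
    boundary = {}
    steps = 0
    while frontier:
        if steps >= max_steps:
            raise ValueError("region does not appear to be finite")
        nxt = {}
        for (x, y), k in frontier.items():
            for pt in ((x + 1, y), (x, y + 1)):
                bucket = nxt if in_region(*pt) else boundary
                bucket[pt] = bucket.get(pt, 0) + k
        frontier = nxt
        steps += 1
    return boundary


def boundary_probabilities(in_region, p):
    """Return {boundary point: P(point)} with P = k * p**y * q**x."""
    q = 1 - p
    counts = boundary_path_counts(in_region)
    return {(x, y): k * p**y * q**x for (x, y), k in counts.items()}


def curtailed(x, y):
    """Example region: continue while fewer than 3 nondefectives and 2 defectives."""
    return x < 3 and y < 2


if __name__ == "__main__":
    probs = boundary_probabilities(curtailed, Fraction(1, 3))
    print(probs, sum(probs.values()))
```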


Of course, all finite regions are closed; but it is convenient to have a condition 
such as that supplied by the following theorem guaranteeing the closure of some 
infinite regions as well. 

THEOREM 1. A sufficient condition² that a region R be closed is that
$\liminf_{n\to\infty} A(n)/\sqrt{n} = 0$, where A(n) is the number of accessible points of index n.

PROOF. We consider the ascending sequence of finite regions $R_n$, each
consisting of the points of R whose indices are less than n. The boundary $B_n$ of
$R_n$ can be written as the set theoretic union $K_n \cup A_n$, where $K_n$ is $B_n \cap B$ and
$A_n$ are the accessible points of R of index n. If $\alpha \in B_n$ and $P_n(\alpha)$ is the
probability of $\alpha$ with respect to $R_n$, it is easily seen that for $\alpha \in K_n$, $P_n(\alpha) = P(\alpha)$.
Since every point of B is ultimately contained in the ascending sequence $K_n$,
$$\sum_{\alpha \in B} P(\alpha) = \lim_{n\to\infty} \sum_{\alpha \in K_n} P(\alpha) = \lim_{n\to\infty} \sum_{\alpha \in K_n} P_n(\alpha) \le 1,$$
the inequality being a consequence of statement B. But $\sum_{\alpha \in A_n} P_n(\alpha)$ is
monotonically decreasing because $\sum_{\alpha \in K_n} P_n(\alpha)$ is monotonically increasing with n
while $\sum_{\alpha \in B_n} P_n(\alpha) = 1$, from statement B.
If we can show $\lim_{n\to\infty} \sum_{\alpha \in A_n} P_n(\alpha) = 0$ under the condition of the theorem,
the proof is complete. For any point $\alpha \in A_n$, $P_n(\alpha) = k_n(\alpha) p^y q^{n-y}$ which for
fixed p is $O(1/\sqrt{n})$. The sum over $A_n$ is $O(A(n)/\sqrt{n})$ and therefore since the
hypothesis of the theorem implies that $A(n)/\sqrt{n}$ attains arbitrarily small values
for arbitrarily large values of n, the sum in question decreases monotonically
to zero.

2 If it is desired to admit p = 0, 1, the existence of boundary points (x, 0) or (0, y)
respectively must be postulated.

COROLLARY. If the number of accessible points of R of index n is bounded, the
region is closed.

That the condition given in Theorem 1 is not a necessary condition may be
seen by examining the region R consisting of all points except points of the form
(2x + 1, 2y + 1) and (3, 0) and (0, 3).

THEOREM 2. If R is closed and R contains S, S is closed. 

Proor. The proof is essentially similar to that of Theorem 1. 

Any reasonable estimate of p will be a function defined on the boundary points, 
because the boundary points constitute, so to speak, a sufficient statistic for p. 
That is, the probability of any path from (0, 0) given the boundary point a at 
which it terminates is independent of p, and is in fact 1/k(a). 

We shall construct an unbiased estimate of p for closed regions R, that is a
function $\hat{p}(\alpha)$, $\alpha \in B$, such that $\sum_{\alpha \in B} \hat{p}(\alpha) P(\alpha) = p$ (absolutely convergent).

CONSTRUCTION. Let $k^*(\alpha)$ be the number of paths in R from the point (0, 1)
to the boundary point $\alpha$, and let $\hat{p}(\alpha) = k^*(\alpha)/k(\alpha)$. We remark that the
definitions imply $k^*((0, 1)) = 1$, when (0, 1) is a boundary point.

THEOREM 3. For any closed region R, $\hat{p}(\alpha)$ is an unbiased estimate of p.

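PROOF:
$$\sum_{\alpha \in B} \hat{p}(\alpha) P(\alpha) = \sum_{\alpha \in B} \frac{k^*(\alpha)}{k(\alpha)}\, k(\alpha) p^y q^x = \sum_{\alpha \in B} k^*(\alpha) p^y q^x.$$
If (0, 1) is a boundary point, then $k^*((0, 1)) = 1$ and $k^*(\alpha) = 0$, $\alpha \ne (0, 1)$, in
which case the sum in question consists of the single term p. If (0, 1) is not a
boundary point, consider the region R' obtained by deleting (0, 1) from R, and
$k'(\alpha)$, the number of paths in R' from the origin to the boundary point $\alpha$ of R.
$$k^*(\alpha) = k(\alpha) - k'(\alpha)$$
$$\sum_{\alpha \in B} k^*(\alpha) p^y q^x = \sum_{\alpha \in B} k(\alpha) p^y q^x - \sum_{\alpha \in B} k'(\alpha) p^y q^x
= 1 - \sum_{\alpha \in B} k'(\alpha) p^y q^x.$$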


aeB 


Now R' is closed (Theorem 2); except for (0, 1) every boundary point of R' is
easily seen to be a boundary point of R; and $k'(\alpha)$ vanishes except for the
boundary points of R'. Therefore
$$p + \sum_{\alpha \in B} k'(\alpha) p^y q^x = 1,$$
and the proof is complete.

3 Even if such a sum were p for a region which was not closed, we would not call the
estimate an unbiased estimate.
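
The construction and Theorem 3 can be illustrated on small finite regions. The following is a minimal sketch, not taken from the paper; the region is passed as a hypothetical predicate `in_region(x, y)`, the function names are illustrative, and the two example regions (curtailed sampling stopping at 3 nondefectives or 2 defectives, and single sampling of 4 items) are chosen here only for illustration. The estimate is computed as $k^*(\alpha)/k(\alpha)$ and its expectation is evaluated exactly with fractions.

```python
from fractions import Fraction


def paths_to_boundary(in_region, start, max_steps=1000):
    """Return {boundary point: number of paths in the region from `start`}.

    If `start` is itself outside the region it is treated as a boundary
    point reached by exactly one (empty) path, as in the construction.
    """
    if not in_region(*start):
        return {start: 1}
    frontier = {start: 1}
    boundary = {}
    steps = 0
    while frontier:
        if steps >= max_steps:
            raise ValueError("region does not appear to be finite")
        nxt = {}
        for (x, y), k in frontier.items():
            for pt in ((x + 1, y), (x, y + 1)):
                bucket = nxt if in_region(*pt) else boundary
                bucket[pt] = bucket.get(pt, 0) + k
        frontier = nxt
        steps += 1
    return boundary


def unbiased_estimate(in_region):
    """Return {boundary point: k*(point) / k(point)}."""
    k = paths_to_boundary(in_region, (0, 0))
    k_star = paths_to_boundary(in_region, (0, 1))
    return {pt: Fraction(k_star.get(pt, 0), k[pt]) for pt in k}


def expectation(in_region, p):
    """Exact expected value of the estimate when the defective fraction is p."""
    q = 1 - p
    k = paths_to_boundary(in_region, (0, 0))
    est = unbiased_estimate(in_region)
    return sum(est[(x, y)] * k[(x, y)] * p**y * q**x for (x, y) in k)


def curtailed(x, y):
    """Stop at 3 nondefectives (x) or 2 defectives (y)."""
    return x < 3 and y < 2


def single(x, y):
    """Ordinary single sampling of 4 items."""
    return x + y < 4


if __name__ == "__main__":
    print(unbiased_estimate(curtailed))
    print(expectation(curtailed, Fraction(1, 3)))
```

For single sampling the estimate reduces to the observed fraction defective, while for the curtailed region it differs from the observed fraction at several boundary points.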

It is clear from the construction that $0 \le \hat{p}(\alpha) \le 1$; this is rather satisfying,
since an estimate of p outside of these bounds would be received with some
misgivings.

Theorem 3 may be generalized to yield unbiased estimates of linear
combinations of functions of the form $p^t q^u$ provided the points (u, t) are not inaccessible
points. We need only let the point (u, t) play the role of (0, 1). Even though
the point (u, t) is inaccessible it may be possible to represent $p^t q^u$ as a polynomial,
none of whose terms correspond to inaccessible points.

It is clear from Theorem 1 that $\hat{p}(\alpha)$ is an unbiased estimate of p for the usual
sequential binomial tests, but the computation may be quite heavy. It should
be noted that the coordinate system used here differs slightly from the coordinate
system customarily used in sequential analysis. The custom is to let the x
coordinate represent the number of items inspected, whereas we use it to
represent the number of nondefectives; this is the only difference between the
coordinates. We understand that in applications the customary procedure seems
preferable, but we find the present coordinates more convenient for the purposes
of this article.

In general $\hat{p}$ is not the only unbiased estimate of $p$. A necessary condition for uniqueness is that the region be simple, that is, that all the points between any two accessible points on the line $x + y = n$ be accessible points. In other words, no accessible points of index $n$ shall be separated on the line $x + y = n$ by inaccessible points or boundary points.

THEOREM 4. A necessary condition that the estimate $\hat{p}$ be the unique unbiased estimate for the closed region $R$ is that $R$ be simple.

PROOF. For a region that is not simple we shall construct a function $m(\alpha)$, not identically zero, such that

$$\sum_{\alpha \in B} m(\alpha) P(\alpha) = 0. \tag{1}$$

Then $\hat{p}(\alpha) + m(\alpha)$ will be an unbiased estimate of $p$ different from $\hat{p}$.

Suppose we have a closed region $R$ which is not simple. We consider the lowest index $n$ where the accessible points are separated. There will be at least one uninterrupted sequence of points between some pair of accessible points that are not accessible points. It is easy to see that all the points of this uninterrupted sequence are boundary points of $R$. Let this sequence be the points $\alpha_j = (x_0 - j, y_0 + j)$, $j = 0, 1, \dots, t$, $x_0 + y_0 = n$. To begin the construction of $m(\alpha)$ let $m(\alpha_j) = (-1)^j / k(\alpha_j)$, $0 \le j \le t$. The coordinates of the point $\alpha''$ above the top point of the sequence are $(x_0 - t, y_0 + t + 1)$, and the number of paths from $\alpha''$ to any point $\alpha$ on the boundary is $l''(\alpha)$, where if $\alpha''$ is a boundary point the number of paths $l''(\alpha'') = 1$; similarly $\alpha' = (x_0 + 1, y_0)$ and $l'(\alpha)$ is the number of paths from $\alpha'$ to the boundary point $\alpha$, with the same convention if $\alpha'$ is a boundary point. To complete the construction of $m(\alpha)$, let $m(\alpha) = -[l'(\alpha) + (-1)^t l''(\alpha)]/k(\alpha)$ for boundary points not members of the sequence under consideration. Before proceeding to check equation (1), we show that

$$\sum_{\alpha \in B} l'(\alpha)\, p^y q^x = p^{y_0} q^{x_0 + 1}, \qquad \sum_{\alpha \in B} l''(\alpha)\, p^y q^x = p^{y_0 + t + 1} q^{x_0 - t}. \tag{2}$$

Because of symmetry we need only carry out the demonstration for the first sum. If $\alpha'$ is a boundary point, $l'(\alpha') = 1$, and for all other points $\alpha$, $l'(\alpha) = 0$, and the sum is the single term $p^{y_0} q^{x_0 + 1}$. If $\alpha'$ is not a boundary point, consider the region obtained by deleting $\alpha'$ from $R$ and the corresponding $k'(\alpha)$, the number of paths from (0, 0) to the boundary points of the new closed region $R'$. Every boundary point of $R'$ except $\alpha'$ is a boundary point of $R$. Let us extend the definition of $k'(\alpha)$ to the whole boundary of $R$ by defining $k'(\alpha) = 0$ for $\alpha$ not in the boundary $B'$ of $R'$. Then it is easy to see that

$$k(\alpha) = k'(\alpha')\, l'(\alpha) + k'(\alpha).$$

Now

$$1 = \sum_{\alpha \in B} k(\alpha)\, p^y q^x = k'(\alpha') \sum_{\alpha \in B} l'(\alpha)\, p^y q^x + \sum_{\alpha \in B} k'(\alpha)\, p^y q^x = k'(\alpha') \sum_{\alpha \in B} l'(\alpha)\, p^y q^x + 1 - k'(\alpha')\, p^{y_0} q^{x_0 + 1},$$

establishing equation (2).
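We now check that $m(\alpha)$ satisfies equation (1):

$$\sum_{\alpha \in B} m(\alpha) k(\alpha)\, p^y q^x = \sum_{j=0}^{t} (-1)^j p^{y_0 + j} q^{x_0 - j} - \sum_{\alpha \in B} l'(\alpha)\, p^y q^x - (-1)^t \sum_{\alpha \in B} l''(\alpha)\, p^y q^x$$

$$= \sum_{j=0}^{t} (-1)^j p^{y_0 + j} q^{x_0 - j} - p^{y_0} q^{x_0 + 1} - (-1)^t p^{y_0 + t + 1} q^{x_0 - t}$$

$$= p^{y_0} q^{x_0 - t} \left( \sum_{j=0}^{t} (-1)^j p^j q^{t - j} - q^{t + 1} - (-1)^t p^{t + 1} \right)$$

$$= 0.$$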


THEOREM 5. A necessary condition that $\hat{p}(\alpha)$ be a unique unbiased estimate of $p$ for the closed region $R$ is that there be no closed region $R'$ whose boundary is a proper subset of the boundary of $R$.

PROOF. Again supposing that the condition is not satisfied we shall construct a function $m(\alpha)$, not identically zero, such that equation (1) is satisfied. Let $k'(\alpha)$ be the number of paths in $R'$ to $\alpha$ in $B$ of $R$, understanding, of course, that $k'(\alpha) = 0$ if $\alpha$ is not in $B'$ of $R'$. Consider $m(\alpha) = 1 - k'(\alpha)/k(\alpha)$; $m(\alpha)$ is not identically zero because $k'(\alpha)$ vanishes for at least one $\alpha$, but $k(\alpha)$ does not. From the closure of $R$ and $R'$ it is obvious that $m(\alpha)$ satisfies equation (1).
Two simple examples will suffice to show that neither simplicity nor the condition of Theorem 5 is alone sufficient to insure the uniqueness of $\hat{p}$. The region consisting of the points whose coordinates are given in the following configuration and whose boundary points are


. 
(0, 3) z 

(0, 2) x 

(0, 1) (1, 1) x x 

(0, 0) (1, 0) (2, 0) (3, 0) x 


indicated by the x's satisfies the condition of Theorem 5 but is not simple. On the other hand the region consisting of all points for which $y < 3$, except for the two points (1, 0), (1, 1), is simple but does not satisfy the conditions of Theorem 5, because the region consisting of all points except (1, 0) with $y < 3$ can play the role of $R'$.

The authors are unable to decide whether the two conditions together guarantee the uniqueness of $\hat{p}$ as an unbiased estimate of $p$, and supply the following sufficient condition which is adequate for many practical purposes.

THEOREM 6. A sufficient condition that a closed region have $\hat{p}(\alpha)$ a unique unbiased estimate of $p$ is that the region be simple and that there exist $g, h$ ($0 < g, h \le 1$) such that for all boundary points $|gx - hy| < M$.

PROOF. If there were an unbiased estimate of $p$ different from $\hat{p}$, subtracting it from $\hat{p}$ would yield an equation of the form (sum absolutely convergent):

$$\sum_{\alpha \in B} m(\alpha)\, p^y q^x = 0, \tag{3}$$

where $m(\alpha)$ is not identically zero. But this will be shown to be impossible. If $m(\alpha)$ were not identically zero, there would be an $\alpha_0$ such that $m(\alpha_0) \neq 0$ and 1) $m(\alpha) = 0$ for all boundary points of index less than that of $\alpha_0$, and 2) one of the coordinates of $\alpha_0$ is less than the corresponding coordinate of any other boundary point for which $m(\alpha) \neq 0$. This follows easily from the simplicity requirement, which implies that the boundary points of index $n$ are broken into two sets: a) those whose y coordinates are less than the y coordinates of the accessible points of index $n$, and b) those whose x coordinates are less than the x coordinates of the accessible points of index $n$.⁴ Since the situations a) and b) are symmetrical we suppose without loss of generality that $\alpha_0$ is a boundary point whose y coordinate is less than that of any other boundary point with $m(\alpha) \neq 0$. Equation (3) may be written

$$m(\alpha_0)\, p^{y_0} q^{x_0} + p^{y_0 + 1} \sum_{\substack{\alpha \in B \\ \alpha \neq \alpha_0}} m(\alpha)\, p^{y - y_0 - 1} q^x = 0, \tag{4}$$

4 It will be seen as the proof proceeds that if there are no boundary points to which alternative a) applies, the restriction $g > 0$ may be removed and replaced by $g \ge 0$; similarly, if there are no boundary points to which b) applies, the condition $h > 0$ may be replaced by $h \ge 0$.







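where the exponents appearing in the sum are nonnegative. But it will be shown that for sufficiently small $p$

$$q^{x_0}\, |m(\alpha_0)| > p \left| \sum_{\substack{\alpha \in B \\ \alpha \neq \alpha_0}} m(\alpha)\, p^{y - y_0 - 1} q^x \right|, \tag{5}$$

which contradicts equation (4). Now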
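$$\left| \sum m(\alpha)\, p^{y - y_0 - 1} q^x \right| \le \sum |m(\alpha)|\, p^{y - y_0 - 1} q^x \tag{6}$$

$$\le \sum |m(\alpha)|\, p^{y - y_0 - 1} q^{\,x - (h y_0 + h + M + gx - hy)/g}$$

$$= q^{-M/g} \sum |m(\alpha)|\, \left(p\, q^{h/g}\right)^{y - y_0 - 1}$$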


< geen y—yo—12z 


=| m(a) | pr "= 


pie ye-ae-t 


where all the summations range over the values indicated in (5). The summation indicated in (5) is thus seen to be dominated by a convergent power series in $p q^{h/g}$.

Thus Theorem 6 shows that $\hat{p}$ is a unique estimate for the sequential binomial tests.

THEOREM 7. A necessary and sufficient condition that $\hat{p}$ be the unique unbiased estimate of $p$ for a closed finite region $R$ is that $R$ be simple.

PROOF. The proof follows immediately from Theorems 4 and 6.


3. Applications and illustrative examples. 


A. Single sampling. In single sampling a random sample of $n$ items is drawn from a lot containing items each of which is either defective or nondefective. It is customary to estimate $p$, the proportion defective, by the unbiased estimate $i/n$, where $i$ is the number of defectives observed. The boundary of the region defined by a single sampling plan consists of all points of index $n$. Now $k((n - i, i)) = \binom{n}{i}$ and $k^*((n - i, i)) = \binom{n - 1}{i - 1}$. Consequently the unique unbiased estimate of $p$ is

$$\hat{p}((n - i, i)) = \binom{n - 1}{i - 1} \Big/ \binom{n}{i} = i/n.$$

It may be of interest to note that an unbiased estimate of the variance $pq/n$ of the proportion $\hat{p}$ is

$$\binom{n - 2}{i - 1} \Big/ \left[ n \binom{n}{i} \right] = \frac{i(n - i)}{n^2 (n - 1)}, \quad (n > 1);$$

this estimate is obtained by the method suggested immediately following Theorem 3.
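The two single-sampling estimates can be verified by summing over the binomial boundary. This is a small sketch under stated assumptions: the function names are hypothetical, and the variance estimate is taken to be $\binom{n-2}{i-1} / [n \binom{n}{i}]$, which follows from letting the point (1, 1) play the role of (0, 1) and then dividing by $n$.

```python
from fractions import Fraction
from math import comb


def single_sampling_estimates(n, i):
    """Return (p_hat, var_hat) at the boundary point (n - i, i), n >= 2."""
    k = comb(n, i)                                  # paths from the origin
    k_star = comb(n - 1, i - 1) if i >= 1 else 0    # paths from (0, 1)
    k_pq = comb(n - 2, i - 1) if 1 <= i <= n - 1 else 0  # paths from (1, 1)
    return Fraction(k_star, k), Fraction(k_pq, n * k)


def expectations(n, p):
    """Exact expected values of both estimates under Binomial(n, p)."""
    q = 1 - p
    mean = var = Fraction(0)
    for i in range(n + 1):
        prob = comb(n, i) * p**i * q**(n - i)
        p_hat, var_hat = single_sampling_estimates(n, i)
        mean += p_hat * prob
        var += var_hat * prob
    return mean, var


p = Fraction(2, 7)
mean, var = expectations(6, p)
print(mean == p, var == p * (1 - p) / 6)
```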

B. Curtailed single sampling. In single sampling schemes, there is usually given a rejection number $c$ as well as the sample size $n$. If $c$ or more defectives are found in the sample the lot is rejected, but if less than $c$ defectives are found in the sample the lot is accepted. It is customary to inspect all the items in the sample even if the final decision to accept or reject the lot is known before the completion of the inspection of the sample. One reason sometimes mentioned for this procedure is that an unbiased estimate for $p$ is not known when the inspection is halted as soon as a decision is reached. We provide the unbiased estimate in the following paragraph.

In curtailed single sampling the boundary points when rejecting are $(x, c)$, $c + x \le n$; when accepting, $(n - c + 1, y)$, $y \le c - 1$. The region is a rectangular array and obviously simple. The unique unbiased estimate along the horizontal line corresponding to rejection with $c > 1$ therefore is

$$\hat{p}((x, c)) = \binom{c - 2 + x}{c - 2} \Big/ \binom{c - 1 + x}{c - 1} = \frac{c - 1}{c + x - 1},$$

or in words, one less than the number of defectives observed divided by one less than the number of observations. The unique unbiased estimate along the vertical line corresponding to acceptance for $c > 1$ is

$$\hat{p}((n - c + 1, y)) = \binom{n - c + y - 1}{y - 1} \Big/ \binom{n - c + y}{y} = \frac{y}{n - c + y},$$

that is, the number of defectives observed divided by one less than the number of observations. We reserved the case $c = 1$ because it is rather illuminating. The construction of Theorem 3 works as usual, and we note that $\hat{p}((0, 1)) = 1$, $\hat{p}((n, 0)) = 0$ as we might expect, but $\hat{p}((x, 1)) = 0$, $0 < x < n$.
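The closed forms for curtailed single sampling can be checked by summing over both boundary lines. The sketch below assumes $c > 1$ and $n > c$, uses a hypothetical function name, and takes the path counts $\binom{x + c - 1}{c - 1}$ on the rejection line and $\binom{n - c + y}{y}$ on the acceptance line.

```python
from fractions import Fraction
from math import comb


def curtailed_expectation(n, c, p):
    """Return (E[p_hat], total probability) for curtailed single sampling.

    Assumes c > 1 and n > c. Rejection occurs at (x, c) with x <= n - c,
    acceptance at (n - c + 1, y) with y <= c - 1.
    """
    q = 1 - p
    total = Fraction(0)
    mass = Fraction(0)
    for x in range(n - c + 1):
        prob = comb(x + c - 1, c - 1) * p**c * q**x
        total += Fraction(c - 1, c + x - 1) * prob
        mass += prob
    for y in range(c):
        prob = comb(n - c + y, y) * p**y * q**(n - c + 1)
        total += Fraction(y, n - c + y) * prob
        mass += prob
    return total, mass


p = Fraction(1, 4)
print(curtailed_expectation(10, 3, p) == (p, 1))
```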

It is somewhat startling to find that the only unbiased estimate of $p$ for curtailed single sampling with $c = 1$ provides zero estimates unless a defective is observed on the first item. We remark that the variance of this estimate is $pq$. In other words, curtailed single sampling with $c = 1$ is no better for estimation purposes than a sample of size one when the unbiased estimate $\hat{p}$ is used.

A limiting case of curtailed sampling when $n$ is unbounded has been considered by Haldane⁵ as a useful technique in connection with estimates of the frequency of occurrence of rare events. The region would not be closed unless $p = 0$ were excluded. In our nomenclature there is a "rejection number" $c$ ($c > 1$), and we continue sampling and inspecting until $c$ defectives have been observed. The unbiased estimate⁶ is $(c - 1)/(j - 1)$, where $j$ is the total number of observations, and of course this is the estimate given by Haldane.

5 J. B. S. Haldane, Nature, Vol. 155 (1945), No. 3924.
6 For the uniqueness, see footnote 4.

C. A general curtailed double sampling plan. The following example will illustrate the sort of calculations involved in computing $\hat{p}$ for multiple and sequential plans. A sample of size $n_1$ is drawn and items are inspected until 1) $r_1$ ($1 < r_1 \le n_1$) defectives are found, or 2) $n_1 - a + 1$ ($a \ge 0$) nondefectives are found, or 3) the sample is exhausted with neither of these events occurring. If case 3) arises, a second sample of size $n_2$ is drawn and inspection proceeds until a grand total of $r_2$ ($r_1 \le r_2 \le n_1 + n_2$) defectives is found or $n_1 + n_2 - r_2 + 1$ nondefectives are found. In this scheme we call $r_1$ and $r_2$ rejection numbers and $a$ an acceptance number. The unique unbiased estimate $\hat{p}$ is as follows:


[ 9 = A 1 ) = eee — . 
(a) P((j; 71)) aeT-? j = 0,1, »m—N; 
: , a ; 
(b) Pm —a+1,i))=7— I, 8 = 1, ++ 


- ’ + yo — Ne —am+t+r—- yr ') 
(c) p(x, r2)) = ed a-no= 3s 


2(* + y\fx —xw%+r2—y—1\ ’ 
“\ 2 re — yo— 1 


mM—-n<rsentm; 


2(* om ‘\™ +m —fety- oo *) 
ye ee ee a es . A ae 


2(* + 7 +nm—rmt+y— yw *) 
= Xo YY — Yo 


ax<xysemt+m; 


where the summations extend from $y_0 = a + 1$ to $y_0 = r_1 - 1$, and $x_0 + y_0 = n_1$. In the above equations (a) and (b) are the estimates corresponding to rejection and acceptance on the basis of the first sample, while (c) and (d) correspond to rejection and acceptance when a second sample has been drawn. Rather than use the sums indicated in (c) and (d), some may find it preferable to make the estimation entirely on the basis of the first sample. If there is no curtailing, the procedure of estimation is equivalent to single sampling, and the estimate is again $i/n$, as mentioned in paragraph A above. If the first sample is curtailed and the estimate is made on the basis of the results of the first sample only, the unique unbiased estimate is given by formula (a) when rejecting, by formula (b) when accepting, and by $i/n_1$ when a second sample is to be drawn. It will be noted that (a) and (b) are identical with the expressions derived in paragraph B over the range of values for which they are valid.

D. The sequential probability ratio test. Using the nomenclature of sequential analysis,⁷ the criterion for a decision is given by two parallel straight lines in the dn-plane

$$d_1 = -h_1 + sn \ \text{(lower line)}, \qquad d_2 = h_2 + sn \ \text{(upper line)}, \tag{7}$$

where $d$ is the number of defectives and $n$ is the number of observations. The acceptance and rejection numbers for any $n$ are given by $a_n$ and $r_n$, respectively, where $a_n$ is the largest positive integer less than or equal to $d_1$, and $r_n$ is the smallest integer greater than or equal to $d_2$. We let $k_a(n)$ be the number of paths from the origin which end in a decision to accept on the $n$th observation; $k_r(n)$ is similarly defined when rejection occurs on the $n$th observation. We also require an auxiliary sequential test with acceptance and rejection numbers $a'_n = a_{n+1} - 1$, $r'_n = r_{n+1} - 1$ (which is equivalent to replacing $h_1$ and $h_2$ by $h_1 + 1 - s$ and $h_2 - 1 + s$ in the equations (7)), and with $k'_a(n)$ and $k'_r(n)$ the number of paths from the origin which lead to acceptance or rejection on the $n$th observation for the new test. A graphical comparison of the two plans shows that:

The unique unbiased estimate of $p$ is

$$\hat{p}(n) = k'_a(n - 1)/k_a(n)$$

when the original test leads to a decision to accept, and

$$\hat{p}(n) = k'_r(n - 1)/k_r(n)$$

when the original test leads to a decision to reject on the $n$th observation.

7 See, for example, Sequential Analysis of Statistical Data: Applications, Section 2, Columbia University Press, 1945.


E. Regions with narrow throats. Let us consider the case of a closed region which has only one accessible point of index $n$, $n > 0$ ($n$ being the lowest index not zero at which this phenomenon occurs). The number of paths from the origin to this accessible point $\alpha'$ we will denote $m$, while the number of paths from $\alpha'$ to $\alpha$, boundary points of index greater than $n$, will be denoted $l(\alpha)$. Then the total number of paths to $\alpha$ from the origin is $m\, l(\alpha)$. We use the construction preceding Theorem 3 to get $\hat{p}(\alpha)$. The number of paths from (0, 1) to $\alpha$ is similarly $m^* l(\alpha)$, so for such points $\hat{p}(\alpha) = m^*/m$. In other words, if a closed region has a narrow throat such as that described, the $\hat{p}(\alpha)$ for $\alpha$ of index higher than that of the accessible point $\alpha'$ are independent of the shape of the region beyond the line $x + y = n$, and in fact they are all identical. The curtailed single sample with $c = 1$ is a particular case of a region with a narrow throat.


4. Estimation based on data from several experiments. In the previous discussion we have been concerned with estimation based on the result of a single experiment. Various kinds of acceptance sampling plans have been suggested as examples of the possible experiments. Acceptance sampling is one of many activities where data toward the estimation of $p$ are often accumulated in a series of experiments. It has been pointed out by John Tukey that when information is available from several experiments the estimate $\hat{p}$ will no longer be the unique unbiased estimate of $p$. Little has been done on this problem of combining information from several experiments, but to illustrate the point, we will discuss a very simple example in terms of acceptance sampling.

Let us suppose that two large lots of the same size are inspected according to the following curtailed single sampling plan: if a defective occurs at the first or second observation, sampling is stopped and the lot is rejected; if the first two items inspected are nondefective, we accept the lot.
The total number of defective and of nondefective items in the two samples form a sufficient statistic for $p$. In a single application of the sampling plan the boundary points with their probabilities are (0, 1), $p$; (1, 1), $pq$; (2, 0), $q^2$. From this information we can generate the possible totals of defectives and of nondefectives which may arise when samples are drawn from two lots, with their probabilities, by expanding

$$(p + pq + q^2)^2 = p^2 + p^2 q^2 + q^4 + 2p^2 q + 2pq^2 + 2pq^3, \tag{8}$$

where a term on the right of the form $m p^y q^x$ is the probability that in two samples there will be $x$ nondefectives and $y$ defectives altogether. On the basis of the observed number pair $(x, y)$, which may be regarded as a possible terminal point $\alpha$ for the two experiments performed successively, we wish to form an unbiased estimate $e((x, y)) = e(\alpha)$. For the estimate $e$ to be unbiased the condition $\sum e(\alpha) P(\alpha) = p$ must be satisfied, where in the present example the $P(\alpha)$ are the six terms on the right of equation (8), and the $e(\alpha)$ are the estimates with which the six probabilities are associated.

In the example under consideration the condition for unbiasedness will be satisfied if and only if $e((0, 2)) = 1$, $e((4, 0)) = 0$, $e((1, 2)) = 1/2$, $e((2, 1)) = [1 - e((2, 2))]/2$, $e((3, 1)) = e((2, 2))/2$. Consequently a one parameter family of unbiased estimates is available. Unfortunately the popular condition that the variance be a minimum depends on the true value of $p$; in fact the variance is minimized just when $e((2, 2)) = 1/(2 + p)$. So an unbiased estimate of uniformly minimum variance does not exist. In practical applications to acceptance sampling one might meet this difficulty by choosing a value of $p$ near zero for such a minimization scheme.
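The one parameter family and its minimum-variance member can be checked with exact arithmetic. In this sketch the function names are hypothetical, and the value $e((1, 2)) = 1/2$ is the one obtained by matching coefficients in the unbiasedness condition against the six terms of equation (8).

```python
from fractions import Fraction


def family(e22):
    """Estimates at the six terminal points, indexed by e((2, 2))."""
    e22 = Fraction(e22)
    return {
        (0, 2): Fraction(1), (4, 0): Fraction(0), (1, 2): Fraction(1, 2),
        (2, 1): (1 - e22) / 2, (3, 1): e22 / 2, (2, 2): e22,
    }


def probabilities(p):
    """The six terms on the right of equation (8), keyed by (x, y)."""
    q = 1 - p
    return {
        (0, 2): p**2, (2, 2): p**2 * q**2, (4, 0): q**4,
        (1, 2): 2 * p**2 * q, (2, 1): 2 * p * q**2, (3, 1): 2 * p * q**3,
    }


def mean_and_variance(e22, p):
    """Exact mean and variance of the estimate with the given e((2, 2))."""
    e, prob = family(e22), probabilities(p)
    mean = sum(e[a] * prob[a] for a in prob)
    var = sum(e[a] ** 2 * prob[a] for a in prob) - mean**2
    return mean, var


p = Fraction(1, 5)
best = 1 / (2 + p)
print(mean_and_variance(best, p))
```

Comparing the variance at `best` with the variance at nearby values of `e22` illustrates that the minimizing member changes with $p$.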

However it is clear that the last word has yet to be said about how best to 
estimate p when one is faced with the results of several experiments. 


5. Conclusion. We would like to call attention to a few problems raised by but not solved in this paper: 1) find a necessary and sufficient condition that $\hat{p}$ be the unique unbiased estimate for $p$; 2) suggest criteria for selecting one unbiased estimate when more than one is possible; 3) evaluate the variance of $\hat{p}$.

In this connection, in a forthcoming paper by M. A. Girshick, it will be shown for certain regions, for example for those of the sequential probability ratio test, that the variance of $\hat{p}(\alpha)$ satisfies

$$\sigma^2_{\hat{p}} \ge pq/E(x + y),$$

where $E(x + y)$ is the expected number of observations required to reach a boundary point.











DISTRIBUTION OF SAMPLE ARRANGEMENTS FOR RUNS 
UP AND DOWN 


By P. S. OLMSTEAD 


Bell Telephone Laboratories, Inc. 


1. Summary. Using the notation of Levene and Wolfowitz [1], a new recursion formula is used to give the exact distribution of arrangements of $n$ numbers, no two alike, with runs up or down of length $p$ or more. These are tabled for $n$ and $p$ through $n = 14$. An exact solution is given for $p > n/2$. The average and variance determined by Levene and Wolfowitz are presented in a simplified form. The fraction of arrangements of $n$ numbers with runs of length $p$ or more is presented for the exact distributions, for the limiting Poisson Exponential, and for an extrapolation from the exact distributions. Agreement among the tables is discussed.
2. Introduction. Assume that

$$X_1, X_2, \dots, X_n$$

represent a series of repetitive measurements. In engineering work, experience has shown that, when the values of these measurements exhibit changes in level, trends, cycles, etc., it is usually indicative of the presence of findable causes. In general, the engineer becomes more confident that a findable cause exists for a change in level, a trend, or a cycle, when the change is large, the trend is long, or the cycle is regular.

On the basis of this experience, the engineer selects particular measures of change in level, length of trend, etc., to guide him in deciding when it is profitable to look for a cause. Having selected the measure, he is interested in knowing how often he may have to look for a cause that does not exist. One such measure is the length of the longest run up or down in a sample of $n$ values. The chart in Figure 1, based on the analysis given here, applies when no two values are alike and indicates the fraction of all nonidentical arrangements that have runs up or down of length $p$ or more.

Attention is directed to the distribution of sample arrangements that have at least one run up or down of length $p$ or more. The distribution and the variances and covariances for lengths of runs up and down are given by Levene and Wolfowitz [1]. In addition, Wolfowitz [2] has shown that the limiting distribution for a particular length of run up or down is a Poisson Exponential.

The notation of Levene and Wolfowitz [1] will be used. Thus, let $a_1, a_2, \dots, a_n$ be $n$ numbers, no two alike, and let the sequence $S = (h_1, h_2, \dots, h_n)$ be any permutation of $a_1, a_2, \dots, a_n$, where $S$ is to be considered a chance variable, and each of the $n!$ permutations of $a_1, a_2, \dots, a_n$ is assigned the same probability. Consider the derived sequence $R$ whose $i$th element is the sign (+ or −) of $h_{i+1} - h_i$ ($i = 1, 2, \dots, n - 1$). A sequence of $p$ consecutive + signs immediately preceded by a − sign is called a run up of length $p$ or more; a sequence of $p$ consecutive − signs immediately preceded by a + sign is called a run down of length $p$ or more. When such a run is both immediately preceded and immediately followed by an unlike sign, it is a run of length exactly $p$.

The distribution of arrangements with at least one run up or down of length $p$ or more is considered under five specific headings:
Fig. 1

1. An exact numerical solution for $n$ small, i.e., computations have been completed up to and including $n = 14$.
2. An exact solution for $p > n/2$.
3. A limiting solution for $(p + 1)!/n$ = constant.
4. An extrapolation from $n$ small.
5. Constant probability relationships.


3. Solution for $n$ small. Starting with a single number, $a_1$, a second number, $a_2 > a_1$, may be placed before or after it to obtain the two independent arrangements of one run of length exactly 1. A third number, $a_3 > a_2 > a_1$, may be placed before, between, or after the preceding pair to obtain two independent arrangements of one run of length exactly 2 and four of two runs of length exactly 1. Continuing this process it is seen that, on the assumption that the distribution of independent arrangements for $(n - 1)$ numbers, $a_1 < a_2 < a_3 < \cdots < a_{n-1}$, is known, the distribution of independent arrangements for $n$ numbers, $a_1 < a_2 < a_3 < \cdots < a_n$, can be found by using the following recursion formula:


Filfunt s feats. °°" gy The %*" 5 VES 


n—1 


= DV (rat WF aaltn2, m3, °°+, (tM — 1), (a + 1, +++ ml 
+ 2F ,-1[1'n-2 ’ Tn-3 2 oe (1 ne 1)} 


n—3 i-—l 


+2>> >) (n+ 1) 


j=2 j=l 
a 7 e 


. Fr-altn—s , sae (Theis + 1), eis eg: 1), ee (7; = 1), we (r = 1)] 
n—3 


+ Zz. (m, + 1)Fr-ulrn—s , sy (Text 1), --- » (% — 2), -°- ("1 — 1)| 


i=l 


where $r_i$, etc., represents the number of runs either up or down of exactly length $i$ in each arrangement of the $n$ numbers designated $F_n$,

(2) $\sum r_i = r$, the total number of runs having lengths exactly $i$ (from 1 to $n - 1$) for each arrangement included in $F_n$,

(3) $\sum i\, r_i = n - 1$, that is, the sum of the lengths of all such runs in any arrangement is one less than the total number of numbers,


PF dfuns s Texts SSS gah a ee "54a G mem, 


the total number of nonidentical sequences of the n 
numbers with exactly r,_; runs of length exactly (n — 1), 
+++ 7, runs of length exactly h, --- r; runs of length 
exactly 7, --- r; runs of length exactly 7, --- 7 runs of 
length exactly 1. Some of these r’s are of course zero 
and their sum is that given in (2) above. Similar 
statements apply to the four F,_,’s. 

In the last two summations in (1), when 7; = 7“, (7; — 1) combines with 
(r, — 1) to give (7, — 2), and when 7; = 7, (7; — 2) combines with (7, — 1) to 
give (r; — 3). ° 

By using the above recursion formula, the exact number of arrangements with 
at least one run up or down of length p or more has been computed for n = 2 
to n = 14, inclusive. This information is given in Table 1. In addition, it 
has been used to determine the probabilities of arrangements with runs up or 
down of length p or more as shown in Table 2. These tables provide a useful 
background for the limiting expressions considered in the next three sections. 
4. Solution for $p > n/2$. When $p > n/2$, it is clear that no sequence can contain more than one run of length $p$. Thus, the expected number of runs of length $p$ or more in an arrangement is also the probability that an arrangement contains runs of length $p$ or more. Writing Levene and Wolfowitz's [1] expression (4.2) in the simplified form previously published [3], we have

$$P(r'_p) = E(r'_p) = \frac{2[(n - p)(p + 1) + 1]}{(p + 2)!} \quad \text{for } \frac{n}{2} < p < n, \tag{4}$$

where $r'_p$ represents the number of runs of length $p$ or more. This expression checks exactly with Table 2 over the range to which it applies.
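Expression (4) can be compared with a direct enumeration for small $n$. The sketch below is an illustration, not the recursion used for Table 1: it lists every permutation, measures the longest run of like signs in the derived sequence, and compares the resulting fraction with the closed form. The function names are assumptions.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial


def longest_run(seq):
    """Length of the longest run up or down, counted in like signs."""
    best = cur = 0
    prev = None
    for a, b in zip(seq, seq[1:]):
        sign = b > a                      # True for +, False for -
        cur = cur + 1 if sign == prev else 1
        prev = sign
        best = max(best, cur)
    return best


def exact_fraction(n, p):
    """Fraction of the n! arrangements with a run of length p or more."""
    hits = sum(1 for s in permutations(range(n)) if longest_run(s) >= p)
    return Fraction(hits, factorial(n))


def formula_4(n, p):
    """Closed form 2[(n - p)(p + 1) + 1] / (p + 2)! from expression (4)."""
    return Fraction(2 * ((n - p) * (p + 1) + 1), factorial(p + 2))


for n in range(3, 8):
    for p in range(n // 2 + 1, n):
        print(n, p, exact_fraction(n, p) == formula_4(n, p))
```

Enumeration grows as $n!$, so this check is practical only for small $n$.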


5. Solution for $(p + 1)!/n$ = constant. As mentioned above, Wolfowitz [2] has shown that the limiting distribution for runs up and down is a Poisson Exponential. His proof applies specifically to the distribution of runs of length exactly $p$. However, the assumptions made in his derivation could have been applied to the distribution of runs of length $p$ or more and would have led to identical conclusions for such runs. To see how closely this is approximated, it is possible to throw expression (4.17) for the variance of $r'_p$ derived by Levene and Wolfowitz [1] into the following simplified form:


or’) = = — p)(p +1) +1] E _ Ap + 1) [6p + 7(p — 1) 
, (p + 2)! (p + 2)! (2p + 3)(2p + 1) 








_ A(p + 2) 1 @ + 1)[(2p + 3)p(p — 1) — 6] 


(2p + 3)! Pl (p + 2)!(2p + 3)(2p + 1) 
2(p + 1)? + ty} ’ 3 p! 1f 1 1 
+ SR hm ee) 1 — = — 21 4+Sl me + el: 
(2p +3)! J ; pt! (2p)!} © 2L (pl? * (2p)! 

Thus, $\sigma^2(r'_p)$ is equal to $E(r'_p)$ within one part in one thousand for $p > 7$ and it is apparent that the first two moments approximate those of a Poisson Exponential. Making use of this information, it is possible to prepare Table 3, which gives approximate values of the probabilities of arrangements with runs of length $p$ or more based on

$$P(r'_p) = 1 - e^{-E(r'_p)} = 1 - e^{-2[(n - p)(p + 1) + 1]/(p + 2)!}. \tag{6}$$

Comparison of Tables 2 and 3 shows agreement to closer than .0001 for $p > 6$, .001 for $p > 5$, .01 for $p > 4$, and .1 for $p > 3$ when $n < 14$. Similarly, the agreement for $p = 1$ is within .1 at $n > 4$, within .01 at $n > 8$, within .001 at $n > 11$ and .0001 at $n > 14$; the agreement for $p = 2$ is within .1 at $n > 10$. Possible agreement beyond $n = 14$ is of course subject to conjecture. However, it may be observed that the maximum difference for a given value of $p$ was reduced from .2679 at $n = 2$, $p = 1$ to .1691 at $n = 6$, $p = 2$, indicating that closer agreement may be expected as $p$ is increased.
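The Poisson Exponential approximation (6) is simple to evaluate. The following sketch uses hypothetical function names and prints the values for $n = 14$, which may be compared with the $n = 14$ row of Table 3 (.9542, .5276, .1321, .0216 for $p$ = 2, 3, 4, 5).

```python
from math import exp, factorial


def expected_runs(n, p):
    """E(r'_p) = 2[(n - p)(p + 1) + 1] / (p + 2)!."""
    return 2 * ((n - p) * (p + 1) + 1) / factorial(p + 2)


def poisson_fraction(n, p):
    """Approximate fraction of arrangements with a run of length >= p."""
    return 1 - exp(-expected_runs(n, p))


for p in range(2, 6):
    print(p, round(poisson_fraction(14, p), 4))
```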
6. Extrapolation from the exact solution for $n$ small. Since the exponential in equation (6) may be written in the form:

$$e^{-2[(n - p)(p + 1) + 1]/(p + 2)!} = e^{2[p(p + 1) - 1]/(p + 2)!}\; e^{-2n(p + 1)/(p + 2)!}, \tag{7}$$

it follows that:

$$\frac{1 - P_{n+1}(r'_p)}{1 - P_n(r'_p)} = e^{-2(p + 1)/(p + 2)!}, \tag{8}$$

TABLE 3 


Fraction of Arrangements of n numbers with Runs of Length p or More Based on Poisson 
Exponential 


2 





| | .0800 | | 
| | .2835 | .0165 | 
| .9030 | .4220 | .0800 | .0028 | 
| 9502 | .5654 | .1393 | .0165 | .0004 | 
.9744 | .6615 | .1949 | .0301 | .0028 | .0001 | 
.9869 | .7364 | .2467 | .0435 | .0052 | .0004 | .0000 | | 

.9933 | .7947 | .2953 | .0567 | .0075 | .0007 | .0001 | .0000 | 

| .9965 | .8401 | .3408 | .0697 | .0099 | .0011 | .0001 | .0000 | .0000 | 

| .9982 | .8742 | .3833 | .0825 | .0122 | .0014 | .0001 | .0000 | .0000 |.0000 | 
11 | .9991 | .9030 | .4230 | .0952 | .0146 | .0018 | .0002 | .0000 | .0000 |.0000 
12 | .9995 | .9244 | .4603 | .1076 | .0169 | .0021 | .0002 | .0000 | .0000 |.0000 | 
13 | .9997 | .9412 | .4951 | .1200 | .0193 | .0025 | .0003 | .0000 | .0000 |.0000 |. 
14} 9999 | .9542 | .5276 | .1321 | .0216 | .0028 | .0003 | .0000 | .0000 |.0000 |. 
15 | .9999 | .9643 | .5581 | .1441 | .0239 | .0032 | .0004 | .0000 | .0000 |-0000 |. 


40 | .9999 | .9165 | .3952 | .0803 | .0118 | .0015 | .0002 | .0000 |.0000 | .0000 
60 | ‘1.0000 | .9780 | .5419 | .1231 | .0186 | .0023 | .0003 | .0000 |.0000 |.0000 
80 | | | 9942 | .6530 | .1639 | .0254 | .0032 | .0004 | .0000 |.0000 | .0060 
100 | | | 9985 | .7371 | .2030 | .0322 | .0041 | .0005 | .0000 |.0000 |.0000 


20 {1.0000 | .9898 | .6834 | .2015 | .0355 | .0049 | .0006 | .0001 | .0000 |.0000 | .0000 








200 | 1.0000 | .9345 | .3717 | .0652 | .0085 | .0010 | .0001 |.0000 | .0000 
500 |< . .9990 | .6924 | .1577 | .0215 | .0024 | .0002 |.0000 | .0000 
1000 | | “ — |1,0000 | .9065 | .2919 | .0428 | .0049 | .0005 |.0000 | .0000 

















5000 | “| | « | © {1.0000 | .8234 | .1976 | .0245 | .0025 |.0002 | .0000 
| | | | | | | 


showing that consecutive values of $1 - P(r'_p)$ are related by a constant of proportionality dependent only on $p$. Since this is true in the limit, Table 2 was examined to determine similar multipliers for extrapolation. The results of this examination are shown in Table 4 together with the values of (8). The constancy of the ratio for a given value of $p$ is such as to permit calculation of probabilities for any value of $n$ to a minimum of three or possibly four decimal places. Such calculations have been made and recorded in Table 5. The following formulae¹ were used for these calculations:


P(r) = 1 


P,,(r2) (o0437360)( 2)" 
Tv 


P,(r3) = (.45093729) (.92404)"~"* 
P(r) = (.87587019) (.98561)""* 
P,(r5) = (.98060695) (.99760)""*° 
P,,(76) (.99752014) (.999652)""" 
or in general 
(10) P,(r,) = 1 — [1 — P,,(r,)][Constant,]"~” . 

Comparison of Table 3 with Tables 2 and 5 shows that the difference for given $p$ and $n$ has a maximum for each value of $p$ and that this maximum decreases with increase in $p$. The maximum values of the difference shown in the tables are: $p = 1$, $n = 2$, .2679; $p = 2$, $n = 6$, .1691; $p = 3$, $n = 20$, .0572; $p = 4$, $n = 80$, .0154; $p = 5$, $n = 500$, .0033; and $p = 6$, $n = 5000$, .0007. Thus, it is apparent that the agreement beyond $p = 6$ should be within .0001 and the method of Section 5 used for Table 3 is satisfactory for these probabilities.

7. Constant probability relationships. From Tables 2, 3 and 5, it is pos- 
sible to make interpolations for the values of n required to have a probability of 
at least P(r’,) that an arrangement will have a run of length p or more. When 
the conditions of Section 5 apply, the value of n is, of course: 


+ 2 , 
(11) = — P= pi log. [1 — P(ry)]. 


¹ It will be noted that the constant for p = 2 has been taken to be 2/π, whereas the last value
shown in Table 4 is .63661959. However, alternate values in this series are converging.
Comparing these subseries shows that by n = 16, the values would agree with 2/π to eight
decimal places. An analytic proof that 2/π is the limiting value of the constant has recently
been found by J. W. Tukey.
While reading the manuscript J. Riordan observed that the number of arrangements
with longest length 1, say f(n, 1), has the generating function

\[
\sum_{n} f(n, 1)\,\frac{t^n}{n!} = 2(\sec t + \tan t),
\]

hence is twice the Euler number for n even and twice the tangent number for n odd, a result
given essentially by Netto [4]. These observations lead directly to the limiting value,
2/π, noted above.
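The convergence described in this footnote can be checked numerically. The sketch below assumes the standard Entringer (boustrophedon) recurrence for the Euler and tangent numbers, which is not taken from the text; the function names are hypothetical. It forms the fraction 2A_n/n! of arrangements whose longest run has length 1 and compares the ratio of consecutive fractions with 2/π.

```python
import math


def zigzag_numbers(n_max):
    """Euler/tangent (zigzag) numbers A_0..A_n_max via the Entringer
    recurrence E(n, k) = E(n, k-1) + E(n-1, n-k), with A_n = E(n, n)."""
    row = [1]
    result = [1]
    for n in range(1, n_max + 1):
        new_row = [0]
        for k in range(1, n + 1):
            new_row.append(new_row[k - 1] + row[n - k])
        row = new_row
        result.append(row[n])
    return result


def fraction_longest_run_one(n, zigzag):
    """Fraction of the n! arrangements (n >= 2) whose longest run has
    length 1, i.e. twice the zigzag number divided by n!."""
    return 2 * zigzag[n] / math.factorial(n)


if __name__ == "__main__":
    a = zigzag_numbers(17)
    for n in range(8, 17):
        ratio = fraction_longest_run_one(n + 1, a) / fraction_longest_run_one(n, a)
        print(n, ratio, ratio - 2 / math.pi)
```

The printed differences alternate in sign and shrink quickly, in line with the remark that alternate values of the series converge.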
TABLE 5


Fraction of Arrangements of n Numbers with Runs of Length p or More Based on 
Extrapolation with Extrapolation Constant 
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200 
500 
1000 
5000 


Sample Size for Constant Probability Based on Poisson Exponential 
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.99 
.95 
90 
.10 
.05 
.O1 
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Sample Size for Constant Probability Based on Extrapolation 


| 
| 
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1 
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20 
13 
10 




















| | 

   n | p = 2  | p = 3  | p = 4  | p = 5  | p = 6
  14 |  .9954 |  .5833 |  .1367 |  .0217 |  .0028
  15 |  .9971 |  .6150 |  .1492 |  .0241 |  .0032
  20 |  .9997 |  .7406 |  .2086 |  .0358 |  .0049
  40 | 1.0000 |  .9466 |  .4078 |  .0810 |  .0118
  60 | 1.0000 |  .9890 |  .5568 |  .1241 |  .0187
  80 | 1.0000 |  .9977 |  .6684 |  .1652 |  .0255
 100 | 1.0000 |  .9995 |  .7518 |  .2044 |  .0322
 200 | 1.0000 | 1.0000 |  .9418 |  .3743 |  .0653
 500 | 1.0000 | 1.0000 |  .9992 |  .6957 |  .1580
1000 | 1.0000 | 1.0000 | 1.0000 |  .9085 |  .2925
5000 | 1.0000 | 1.0000 | 1.0000 | 1.0000 |  .8241

TABLE 6 





| 4 5 6 7 8 

| 335 | 1939 13268 

| 219 | 1263 | 8633 

| 169 971 | 6637 | | 

ee 49 | 309 | 2296 | 

| 7 26 | 153 | 1170 | 10350 
} 4 9 34) 235 | 2036 


TABLE 7 





| 
| 





>? 


| 
P(r'_p) | p = 1 | p = 2 | p = 3 | p = 4 | p = 5 | p = 6
    .99 |   -   |   12  |   61  |  321  |  1923 | 13239
    .95 |   -   |    8  |   40  |  210  |  1253 |  8614
    .90 |   -   |    7  |   32  |  162  |   964 |  6622
    .10 |   -   |  (2)  |    4  |   11  |    48 |   308
    .05 |   -   |  (2)  |  (3)  |    7  |    26 |   153
    .01 |   -   |  (2)  |  (3)  |  (4)  |     9 |
Similarly, it may be obtained from the extrapolation formulae of Section 6
in the form:

\[
n = n_0 + \frac{\log\,[1 - P_n(r'_p)] - \log\,[1 - P_{n_0}(r'_p)]}{\log\,[\text{Constant}_p]}. \tag{12}
\]


Results of computations based on (11) and (12), are given in Tables 6 and 7, 
respectively for particular values of P(r). It will be noted that Table 7 is in 
exact agreement with Table 2 and that it differs but little in a practical sense 
from Table 6. 
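Solving the extrapolation rule of Section 6 for n gives a direct way to compute such sample sizes. The following is a rough sketch under stated assumptions: the constants are those quoted in Section 6, the base sizes `n0` (9 for p = 2, 13 otherwise) are inferred rather than read from the source, and `sample_size` is a hypothetical helper name.

```python
import math

# (1 - P_{n0}(r_p), extrapolation constant, n0); the n0 values are assumptions.
EXTRAPOLATION = {
    2: (0.0437360, 2.0 / math.pi, 9),
    3: (0.45093729, 0.92404, 13),
    4: (0.87587019, 0.98561, 13),
}


def sample_size(p, probability):
    """Real-valued n at which the extrapolated probability of a run of
    length p or more equals the given probability."""
    tail_at_n0, constant, n0 = EXTRAPOLATION[p]
    return n0 + (math.log(1.0 - probability) - math.log(tail_at_n0)) / math.log(constant)


if __name__ == "__main__":
    for p in (2, 3, 4):
        print(p, [round(sample_size(p, prob), 2) for prob in (0.99, 0.95, 0.90)])
```

For the cases checked (P = .99, .95, .90 with p = 2, 3, 4), the integer parts of these values coincide with the Table 7 entries, for example 61 and 321 at P = .99.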
THE THEORY OF UNBIASED ESTIMATION


By Paul R. Halmos
Syracuse University 


1. Summary. Let $F(P)$ be a real valued function defined on a subset $\mathcal{D}$ of
the set $\mathcal{D}^*$ of all probability distributions on the real line. A function $f$ of $n$ real
variables is an unbiased estimate of $F$ if for every system, $X_1, \ldots, X_n$, of independent
random variables with the common distribution $P$, the expectation
of $f(X_1, \ldots, X_n)$ exists and equals $F(P)$, for all $P$ in $\mathcal{D}$. A necessary and sufficient
condition for the existence of an unbiased estimate is given (Theorem 1),
and the way in which this condition applies to the moments of a distribution is
described (Theorem 2). Under the assumptions that this condition is satisfied
and that $\mathcal{D}$ contains all purely discontinuous distributions it is shown that
there is a unique symmetric unbiased estimate (Theorem 3); the most general 
(non symmetric) unbiased estimates are described (Theorem 4); and it is 
proved that among them the symmetric one is best in the sense of having the 
least variance (Theorem 5). Thus the classical estimates of the mean and the 
variance are justified from a new point of view, and also, from the theory, com- 
putable estimates of all higher moments are easily derived. It is interesting to 
note that for n greater than 3 neither the sample nth moment about the sample 
mean nor any constant multiple thereof is an unbiased estimate of the nth mo- 
ment about the mean. Attention is called to a paradoxical situation arising in 
estimating such non linear functions as the square of the first moment. 


2. Introduction. Consider the set $\mathcal{D}^*$ of all probability distributions on the
real line. The elements $P$ of $\mathcal{D}^*$ may be regarded as either set functions $P(E)$,
defined for all Borel subsets $E$ of the real line (probability measures), or monotone
non decreasing functions $P(x)$ of a real variable $x$ (cumulative distribution
functions). Suppose that $F = F(P)$ is a real numerically valued function of
distributions. For example $F(P)$ may be the expectation or the standard deviation
of the distribution $P$, or it may be the amount of probability $P$ assigns to
some fixed set $E_0$. The problem of unbiased estimation is to find a function
(statistic) of a sample of $n$ from a population with distribution $P$, in such a way
that the expected value of this function is equal to the value of $F(P)$ identically
in $P$. More precisely, if $F(P)$ is defined on a subset $\mathcal{D}$ of $\mathcal{D}^*$, then an unbiased
estimate of order $n$ over $\mathcal{D}$ is a real valued function $f = f(x_1, \ldots, x_n)$ of $n$ real
variables, which is such that for every system $X_1, \ldots, X_n$ of independent random
variables with the common distribution $P$ (belonging to $\mathcal{D}$), the expected
value $E\{f(X_1, \ldots, X_n)\}$ exists and is equal to $F(P)$.

The problems posed in this paper are the following. (I) Which functions
$F(P)$ admit an unbiased estimate? (II) What are all possible unbiased estimates
of a given function $F(P)$? (III) Is there a reasonable definition of "best
unbiased estimate" which enables one to select from all unbiased estimates of a
fixed function $F(P)$ a unique best one?¹

I shall present below a complete solution of these problems, under the assumption
that the domain of estimation, $\mathcal{D}$, is sufficiently large. The results also
shed light on some classical concepts. It is possible, for instance, to exhibit 
computable unbiased estimates for all moments of a distribution about its ex- 
pected value, and to prove that the known estimates of the expectation and the 
variance are essentially unique. 

The vague concept of sufficiently large estimation domain $\mathcal{D}$ is easily made
precise. For any Borel set $E$ on the real line let $\mathcal{D}_*(E)$ be the set of all those
distributions which assign the probability 1 to some finite subset of $E$. Thus,
for example, if $E$ consists of exactly two points then $\mathcal{D}_*(E)$ is the set of all possible
probability distributions in a dichotomy. A subset $\mathcal{D}$ of $\mathcal{D}^*$ will be said to be
finitely closed over $E$ if $\mathcal{D}_*(E) \subset \mathcal{D}$. Finitely closed domains are "sufficiently
large."

It is clear that some restriction (from below) on the size of $\mathcal{D}$ is essential for a
discussion of the characterization problem (II) and the uniqueness problem
(III). For if, for example, the domain $\mathcal{D}$ is artificially restricted to contain
only one distribution, then there will always be a plethora of completely un- 
related and uninteresting solutions of the problem of unbiased estimation, none 
of which can be said to be preferable to any other one. It is true, however, that 
the assumption of finite closure is too restrictive. The general problems of 
unbiased estimation are still unsolved over such interesting and useful domains 
as the set of all continuous distributions, and the set of all absolutely continuous 


distributions. There are also more special problems connected with special 
classes of distributions (e.g. the normal and the rectangular distributions), as 
well as the general problem of characterizing the domains which are sufficiently 
large to make a uniqueness theorem possible. I hope to return to these problems 
in the near future. 


3. Existence. A function $F(P)$, defined on a domain $\mathcal{D} \subset \mathcal{D}^*$, will be called
homogeneous over $\mathcal{D}$, of degree $k = 1, 2, \ldots$, if there exists a real valued function
$\varphi = \varphi(x_1, \ldots, x_k)$ of $k$ real variables which is such that for every $P$ in $\mathcal{D}$
the Lebesgue-Stieltjes integral²

\[
\int \cdots \int \varphi(x_1, \ldots, x_k)\, dP(x_1) \cdots dP(x_k)
\]


¹ My interest in these problems stems from conversations and correspondence with
Reinhold Baer, who first called my attention to the problem of finding unbiased estimates 
for the moments about the expected value. The general questions of existence and 
uniqueness of unbiased estimates were raised explicitly by J. F. Steffensen in a footnote 
on p. 18 of his book, Some Recent Researches in the Theory of Statistics and Actuarial Science, 
Cambridge Univ. Press, 1930. 

² All integrals in this paper are to be extended over the entire Euclidean space of
indicated dimension.







exists and is equal to F(P), and if the integer k is minimal with respect to the 
property of the existence of such a representation. 

THEOREM 1. A necessary and sufficient condition that $F$ have an unbiased estimate
of order $n$ over $\mathcal{D}$ is that it be homogeneous over $\mathcal{D}$ of degree $k \le n$.

PROOF. To prove sufficiency, suppose that

\[
F(P) = \int \cdots \int \varphi(x_1, \ldots, x_k)\, dP(x_1) \cdots dP(x_k)
\]

for all $P$ in $\mathcal{D}$, with $k \le n$. Define $f$ by

\[
f(x_1, \ldots, x_k, x_{k+1}, \ldots, x_n) = \varphi(x_1, \ldots, x_k).
\]

Then if $X_1, \ldots, X_n$ are independent random variables with the same distribution
$P$ (belonging to $\mathcal{D}$),

\[
\begin{aligned}
E\{f(X_1, \ldots, X_n)\} &= \int \cdots \int f(x_1, \ldots, x_n)\, dP(x_1) \cdots dP(x_n)\\
&= \int \cdots \int \varphi(x_1, \ldots, x_k)\, dP(x_1) \cdots dP(x_n)\\
&= \int \cdots \int \varphi(x_1, \ldots, x_k)\, dP(x_1) \cdots dP(x_k) = F(P).
\end{aligned}
\]

The necessity of the condition is even more trivial: the definition of an unbiased
estimate of order $n$ is such that the existence of one is equivalent to homogeneity
of degree $\le n$.

As a special case, and an important illustration of how the degree is evaluated, 
consider the moments $F_m = F_m(P)$ of a distribution $P$ about the origin,

\[
F_m(P) = \int x^m\, dP(x),
\]

and the moments $F'_m(P)$ about the expected value $F_1(P)$,

\[
F'_m(P) = \int (x - F_1(P))^m\, dP(x).
\]


THEOREM 2. If $\mathcal{D}$ is any subset of $\mathcal{D}^*$ contained in the domain of definition of
each of the functions $F_1, \ldots, F_m$, and finitely closed over $\{0, 1\}$ (where $\{0, 1\}$
denotes the set containing the two numbers 0 and 1 only), and if $k_1, \ldots, k_m$ are
arbitrary non negative integers, then the function

\[
F(P) = F_1^{k_1}(P) \cdots F_m^{k_m}(P)
\]

is homogeneous over $\mathcal{D}$ of degree exactly $k = k_1 + \cdots + k_m$.
PROOF. The representation of $F$ by a $k$-fold integral,

\[
F(P) = \int \cdots \int x_1 \cdots x_{k_1}\, x_{k_1+1}^2 \cdots x_{k_1+k_2}^2 \cdots x_{k_1+\cdots+k_{m-1}+1}^m \cdots x_{k_1+\cdots+k_m}^m\, dP(x_1) \cdots dP(x_k),
\]

shows that $F$ is homogeneous of degree $\le k$. That the degree of $F$ is indeed
equal to $k$ is proved as follows. Suppose that

\[
F(P) = \int \cdots \int \varphi(x_1, \ldots, x_h)\, dP(x_1) \cdots dP(x_h)
\]

for all $P$ in $\mathcal{D}$. Observe that if $P$ is the singular distribution which assigns probability
1 to the point 1 on the real line then the identity of the two representations
of $F$ reduces to $\varphi(1, \ldots, 1) = 1$; similarly assigning the total probability
to 0 implies that $\varphi(0, \ldots, 0) = 0$. More generally, choose $P$ so that it assigns
the probability $p$, $(0 \le p \le 1)$, to the point 1, and the probability $q = 1 - p$
to 0. It follows that

\[
p^k = p^h \varphi_0 + p^{h-1} q\, \varphi_1 + \cdots + q^h \varphi_h,
\]

where $\varphi_j$ is the sum of all $\varphi(x_1, \ldots, x_h)$, over those $h$-tuples $(x_1, \ldots, x_h)$ which
contain exactly $j$ 0's and $(h - j)$ 1's. If $q$ is replaced by $1 - p$ in the right
side of the last equation, the resulting equation is supposed to be satisfied by
all $p$, $0 \le p \le 1$. If, however, $h < k$, then the two sides of the equation are
polynomials of different degrees; hence $h \ge k$.

COROLLARY. If $\mathcal{D}$ is any subset of $\mathcal{D}^*$ contained in the domain of definition of
the function $F_m$, and finitely closed over $\{0, 1\}$ then $F'_m$ is homogeneous over $\mathcal{D}$ of
degree exactly $m$ and, consequently, it has unbiased estimates over $\mathcal{D}$ of order $n$ if
and only if $m \le n$.

PROOF. Since

\[
\begin{aligned}
F'_m(P) &= \int (x - F_1(P))^m\, dP(x)\\
&= \sum_{j=0}^{m} (-1)^j \binom{m}{j} F_1^j(P) \int x^{m-j}\, dP(x)\\
&= \sum_{j=0}^{m} (-1)^j \binom{m}{j} F_1^j(P)\, F_{m-j}(P),
\end{aligned}
\]

the conclusions of the corollary are implied by Theorems 1 and 2.


4. Symmetry. Theorem 1 may be regarded as a solution of the existence 
problem (I). An examination of its proof shows, however, that the estimates 
there constructed are very unsatisfactory indeed. In the special case $F = F_1$,
for instance, the estimate becomes $f(x_1, \ldots, x_n) = x_1$. The first element of a
sample of $n$ is, to be sure, an unbiased estimate of the expectation of the distribution,
but it is intuitively clear that, since it ignores most of the information
at hand, it is not a good one. In order to exhibit the best estimates it becomes
necessary to study the symmetric ones. Recall that a function $f = f(x_1, \ldots, x_n)$
is symmetric if it is invariant under all permutations of its arguments. The
proof of the main theorem of this section, the theorem of uniqueness for sym- 
metric unbiased estimates, is based on two lemmas. 
LEMMA 1. If $Q = Q(p_1, \ldots, p_n)$ is a homogeneous polynomial of degree > 0
in $n$ real variables, such that whenever $0 \le p_i \le 1$, $i = 1, \ldots, n$, and
$p_1 + \cdots + p_n = 1$ then $Q(p_1, \ldots, p_n) = 0$, then $Q$ must be identically zero.

PROOF. (Induction on $n$.) For $n = 1$ the lemma is trivial. Assume therefore
that $n > 1$ and that the lemma is true for $n - 1$. Observe that the hypothesis
is equivalent to the vanishing of $Q$ for all systems of non negative arguments
(without the restriction $p_1 + \cdots + p_n = 1$), since any such system $\{p_i\}$ can be
replaced by $\{p_i/(p_1 + \cdots + p_n)\}$. If in $Q$ the variables $p_1, \ldots, p_{n-1}$ are given
any non negative values, then the hypothesis implies that the resulting polynomial
in $p_n$ vanishes for all non negative values of $p_n$, and therefore identically.
Consequently the coefficients of the powers of $p_n$ in $Q$, which are themselves
homogeneous polynomials in $p_1, \ldots, p_{n-1}$, vanish for non negative arguments
and therefore (by the induction hypothesis) identically.³

LEMMA 2. If $\mathcal{D}$ is a set of distributions finitely closed over a Borel set $E$ of the
real line and if the symmetric function $f(x_1, \ldots, x_n)$ is such that for every distribution
$P$ in $\mathcal{D}$ the Lebesgue-Stieltjes integral

\[
\int \cdots \int f(x_1, \ldots, x_n)\, dP(x_1) \cdots dP(x_n)
\]

exists and has the value zero, then $f(x_1, \ldots, x_n) = 0$ whenever $x_i \in E$, $i = 1, \ldots, n$.

PROOF. Consider any point $(x_1^0, \ldots, x_n^0)$ with $x_i^0 \in E$, $i = 1, \ldots, n$, and any
distribution $P$ (in $\mathcal{D}_*(E)$) which assigns the probability 1 to the subset
$\{x_1^0, \ldots, x_n^0\}$ of $E$. If the probability of $x_i^0$ is $p_i$, $i = 1, \ldots, n$, then the integral

\[
\int \cdots \int f(x_1, \ldots, x_n)\, dP(x_1) \cdots dP(x_n)
\]

is a homogeneous polynomial (of degree $n$) in the $n$ variables $p_1, \ldots, p_n$. The
hypotheses of Lemma 1 are satisfied; it follows that this polynomial vanishes
identically. The symmetry of $f$ implies that the coefficient of the term $p_1 \cdots p_n$
is exactly $n!\, f(x_1^0, \ldots, x_n^0)$, thereby establishing the conclusion of the lemma.

If $\varphi = \varphi(x_1, \ldots, x_k)$ is any function of $k$ real variables and if $n$ is a positive
integer, $n \ge k$, it is convenient to write

\[
\varphi^{[n]} = \varphi^{[n]}(x_1, \ldots, x_n)
\]

for the average of the values of $\varphi$ over all points obtained from $(x_1, \ldots, x_n)$ by
extracting ordered subsets of $k$ $x$'s. Thus, for instance,

\[
(x_1 x_2)^{[3]} = \tfrac{1}{3}(x_1 x_2 + x_1 x_3 + x_2 x_3)
\]

and

\[
(x_1)^{[n]} = \tfrac{1}{n}(x_1 + \cdots + x_n).
\]
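The symmetrization just defined can be computed directly for small samples. This is an illustrative sketch only; the helper name `symmetrize` is hypothetical, and exact rational arithmetic is used so that the two examples above can be reproduced without rounding.

```python
from fractions import Fraction
from itertools import permutations


def symmetrize(phi, k, xs):
    """Average of phi over all ordered subsets of k entries of xs,
    i.e. the value of phi^[n] at the point xs (n = len(xs) >= k)."""
    tuples = list(permutations(xs, k))  # ordered k-subsets, chosen by position
    return sum(Fraction(phi(*t)) for t in tuples) / len(tuples)


if __name__ == "__main__":
    # (x1 x2)^[3] at (1, 2, 3) is (1*2 + 1*3 + 2*3)/3 = 11/3
    print(symmetrize(lambda a, b: a * b, 2, [1, 2, 3]))
    # (x1)^[n] is the sample mean
    print(symmetrize(lambda a: a, 1, [1, 2, 3, 6]))
```

Averaging over ordered subsets by position, rather than by value, keeps repeated observations from being collapsed.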


³ I am indebted to J. B. Rosser and R. J. Walker for this proof; my original proof of
Lemma 1 was more complicated. 
THEOREM 3. Let $\mathcal{D}$ be a set of distributions finitely closed over a Borel set $E$ of
the real line and let $F$ be a homogeneous function of degree $k$,

\[
F(P) = \int \cdots \int \varphi(x_1, \ldots, x_k)\, dP(x_1) \cdots dP(x_k)
\]

over $\mathcal{D}$. If $f(x_1, \ldots, x_n)$ is a symmetric unbiased estimate of $F$ over $\mathcal{D}$, of order
$n \ge k$, then for every point $(x_1, \ldots, x_n)$ with $x_i \in E$, $i = 1, \ldots, n$,
$f(x_1, \ldots, x_n)$ is equal to the symmetrized function $\varphi^{[n]}(x_1, \ldots, x_n)$.

PROOF. Observe first that

\[
\int \cdots \int \varphi(x_1, \ldots, x_k)\, dP(x_1) \cdots dP(x_k)
\]

remains invariant if $(x_1, \ldots, x_k)$ is replaced by $(x_{i_1}, \ldots, x_{i_k})$, where
$\{i_1, \ldots, i_k\}$ is any subset of $\{1, \ldots, n\}$, since the change is merely a matter of notation.
It follows that

\[
\begin{aligned}
F(P) &= \int \cdots \int \varphi(x_{i_1}, \ldots, x_{i_k})\, dP(x_1) \cdots dP(x_n)\\
&= \int \cdots \int \varphi^{[n]}(x_1, \ldots, x_n)\, dP(x_1) \cdots dP(x_n),
\end{aligned}
\]

so that $\varphi^{[n]}$ is indeed an unbiased estimate of $F$. Since $\varphi^{[n]}$ is also symmetric,
$f - \varphi^{[n]}$ satisfies the hypotheses of Lemma 2, and the desired conclusion follows
from an application of that lemma.
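The unbiasedness of the symmetrized function can be confirmed exactly for a small discrete distribution. The sketch below is an illustration under assumptions: the three-point distribution `P` and the helper names are invented for the example, and $F(P) = (F_1(P))^2$ is used with $\varphi(x_1, x_2) = x_1 x_2$.

```python
from fractions import Fraction
from itertools import permutations, product


def symmetrize(phi, k, xs):
    """Value of phi^[n] at xs: average of phi over ordered k-subsets of xs."""
    tuples = list(permutations(xs, k))
    return sum(Fraction(phi(*t)) for t in tuples) / len(tuples)


def expectation(estimate, dist, n):
    """Exact E{estimate(X_1, ..., X_n)} for independent draws from the
    finite distribution dist (a mapping value -> probability)."""
    total = Fraction(0)
    for sample in product(dist.items(), repeat=n):
        weight = Fraction(1)
        for _, prob in sample:
            weight *= prob
        total += weight * estimate([x for x, _ in sample])
    return total


# Hypothetical example distribution (not from the paper).
P = {0: Fraction(1, 3), 1: Fraction(1, 2), 4: Fraction(1, 6)}
mean = sum(x * p for x, p in P.items())  # F_1(P) = 7/6

unbiased = expectation(lambda xs: symmetrize(lambda a, b: a * b, 2, xs), P, 3)
biased = expectation(lambda xs: Fraction(sum(xs), len(xs)) ** 2, P, 3)

if __name__ == "__main__":
    print(mean ** 2, unbiased, biased)
```

Here the symmetrized estimate of order 3 has expectation exactly $(F_1(P))^2$, while the square of the sample mean overshoots by one third of the variance.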


5. Characterization. For any Borel set $E$ on the real line let $\mathcal{D}^*(E)$ be the
set of all those distributions which assign the probability 0 to the complement
of $E$. Thus, clearly, $\mathcal{D}_*(E) \subset \mathcal{D}^*(E)$; if $E$ is the entire real line then $\mathcal{D}^*(E) = \mathcal{D}^*$;
if $E$ consists of a finite number of points then $\mathcal{D}_*(E) = \mathcal{D}^*(E)$.

THEOREM 4. Let $\mathcal{D}$ be a set of distributions finitely closed over a Borel set $E$
of the real line and contained in $\mathcal{D}^*(E)$, and let $F$ be a homogeneous function of
degree $k$,

\[
F(P) = \int \cdots \int \varphi(x_1, \ldots, x_k)\, dP(x_1) \cdots dP(x_k)
\]

over $\mathcal{D}$. A necessary and sufficient condition that the function $f = f(x_1, \ldots, x_n)$
be an unbiased estimate of $F$ over $\mathcal{D}$, of order $n \ge k$, is that the Lebesgue-Stieltjes
integral

\[
\int \cdots \int f(x_1, \ldots, x_n)\, dP(x_1) \cdots dP(x_n)
\]

exist for every $P$ in $\mathcal{D}$ and that for every point $(x_1, \ldots, x_n)$ with $x_i \in E$, $i = 1, \ldots, n$,
the symmetrized function $f^{[n]}(x_1, \ldots, x_n)$ be equal to $\varphi^{[n]}(x_1, \ldots, x_n)$.

PROOF. If $f$ is an unbiased estimate then $f^{[n]}$ is a symmetric unbiased estimate
and therefore, by Theorem 3, equal to $\varphi^{[n]}$; the converse follows from the
facts that

\[
\int \cdots \int f(x_1, \ldots, x_n)\, dP(x_1) \cdots dP(x_n)
= \int \cdots \int f^{[n]}(x_1, \ldots, x_n)\, dP(x_1) \cdots dP(x_n)
\]

and that (as a consequence of the hypothesis $\mathcal{D} \subset \mathcal{D}^*(E)$) the equality of $f^{[n]}$ and
$\varphi^{[n]}$ for points whose coordinates are in $E$ implies the equality of their integrals.

Theorem 4 exhibits all possibilities for unbiased estimates (over domains satisfying
the hypotheses). Given a point $(x_1, \ldots, x_n)$, suppose that the number of
different points obtained from it by permutations of the coordinates is $N$. (If
the $x_i$ are all different then $N = n!$.) An unbiased estimate is obtained if $f$ is
defined arbitrarily over $N - 1$ of these points and if its value on the $N$th point is
chosen so that the identity $f^{[n]} = \varphi^{[n]}$ is satisfied. As long as the arbitrary
choices at the (possibly) uncountably infinite point groups are not too wild and
not too large (i.e. are such that the resulting function $f$ is measurable and integrable),
$f$ will indeed be an unbiased estimate. Typical nonpathological examples
of unsymmetric unbiased estimates are weighted averages of the permuted values
of $\varphi(x_1, \ldots, x_k)$, similar to the unweighted average $\varphi^{[n]}(x_1, \ldots, x_n)$.


6. Uniqueness. The assumption of symmetry is a rather natural one to require 
of an estimate: it amounts to requiring that the estimated value should be 
independent of the order in which the observations are made. Theorems 3 
and 4 establish that the concept of symmetry is inherently associated with un- 
biased estimation and that, under this assumption, there is a unique unbiased 
estimate (whenever there is one at all). These theorems, therefore, constitute 
a partial answer to the uniqueness problem (III): symmetry, after all, is a possi- 
ble interpretation of “good” estimate. From another point of view the answer 
to the problem of "best" estimate is contained in the following theorem.

THEOREM 5. Under the hypotheses of Theorem 4, among all unbiased estimates
of

\[
F(P) = \int \cdots \int \varphi(x_1, \ldots, x_k)\, dP(x_1) \cdots dP(x_k)
\]

the symmetric one, $\varphi^{[n]}(x_1, \ldots, x_n)$, is the one with least variance or, equivalently,
the least second moment

\[
\int \cdots \int \{f(x_1, \ldots, x_n)\}^2\, dP(x_1) \cdots dP(x_n).
\]

PROOF. Observe first that if $X_1, \ldots, X_n$ are independent random variables
with the same distribution $P$ then, if $f$ is an unbiased estimate of $F(P)$, the
variance of $f(X_1, \ldots, X_n)$ is given by

\[
E\{f(X_1, \ldots, X_n)\}^2 - E^2\{f(X_1, \ldots, X_n)\}.
\]

Since the second term is the same for all $f$, namely $F^2(P)$, minimizing the variance
is indeed equivalent to minimizing

\[
E\{f(X_1, \ldots, X_n)\}^2 = \int \cdots \int \{f(x_1, \ldots, x_n)\}^2\, dP(x_1) \cdots dP(x_n).
\]

This quantity need not be finite even for $f$'s and $P$'s for which $E\{f(X_1, \ldots, X_n)\}$
exists. It will be shown, however, to be minimized by $\varphi^{[n]}$ in the sense that

\[
E\{\varphi^{[n]}(X_1, \ldots, X_n)\}^2 \le E\{f(X_1, \ldots, X_n)\}^2
\]

for all unbiased estimates $f$ and all $P$, and that the strict inequality actually holds for
some $P$.

For the proof consider any unbiased estimate $f$ of $F$. For any given point
$(x_1, \ldots, x_n)$ suppose that $N$ is the number of different points obtained from it
by permutations of the arguments, and denote by $f_j$, $j = 1, \ldots, N$, the values
of $f$ at these points. Since, according to Theorem 4, $f^{[n]} = \varphi^{[n]}$, it follows that

\[
(\varphi^{[n]})^2 = \Bigl(\frac{1}{N} \sum_{j=1}^{N} f_j\Bigr)^2 \le \frac{1}{N} \sum_{j=1}^{N} f_j^2 = (f^2)^{[n]}.
\]

Hence

\[
\begin{aligned}
\int \cdots \int \{\varphi^{[n]}(x_1, \ldots, x_n)\}^2\, dP(x_1) \cdots dP(x_n)
&\le \int \cdots \int \{f^2(x_1, \ldots, x_n)\}^{[n]}\, dP(x_1) \cdots dP(x_n)\\
&= \int \cdots \int f^2(x_1, \ldots, x_n)\, dP(x_1) \cdots dP(x_n).
\end{aligned}
\]

This already establishes the minimal property of $\varphi^{[n]}$ in the weak sense.

If the inequality were an equality for all $P$ for which the terms are defined then,
by Lemma 2, it would follow that

\[
\{\varphi^{[n]}(x_1, \ldots, x_n)\}^2 = (f^2)^{[n]}(x_1, \ldots, x_n)
\]

for all $(x_1, \ldots, x_n)$. Hence the Schwarz inequality, as applied above to the
sum $\frac{1}{N} \sum_{j=1}^{N} f_j$, reduces to an equality; this can happen if and only if $(f_1, \ldots, f_N)$
is proportional to $\bigl(\frac{1}{N}, \ldots, \frac{1}{N}\bigr)$, i.e. if and only if all $f_j$ are equal to each other.
The validity of this statement for every point is equivalent to the symmetry of $f$
and hence, by Theorem 3, to the statement $f = \varphi^{[n]}$. This concludes the proof
of Theorem 5.
7. Concluding remarks. (1) The most obvious estimates of the moments,
$F_m(P)$, of a distribution about the origin are the sample moments

\[
\frac{1}{n} \sum_{i=1}^{n} x_i^m.
\]

Their use is justified by the uniqueness theorems (3, 4, and 5) of this paper.
Similarly one might think that the moments, $F'_m(P)$,
about the expected value $F_1(P)$, are best estimated by the sample moments

\[
g_m(x_1, \ldots, x_n) = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^m
\]

about the sample mean $\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$. Denote by $f_m(x_1, \ldots, x_n)$ the estimate
of $F'_m(P)$ obtained by expanding $F'_m(P)$ in terms of the $F_j(P)$, as in the proof of
the corollary to Theorem 2, and then estimating each term by the symmetric
estimate considered in Theorems 3 and 4. Then an easy calculation shows that

\[
f_2(x_1, \ldots, x_n) = \frac{n}{n-1}\, g_2(x_1, \ldots, x_n),
\]

\[
f_3(x_1, \ldots, x_n) = \frac{n^2}{(n-1)(n-2)}\, g_3(x_1, \ldots, x_n).
\]

(These functions are the classical estimates of $F'_2$ and $F'_3$.) For $m > 3$, $f_m$ can
still be expressed in terms of $g$'s, but no longer as a constant multiple of $g_m$. It
appears that in general $f_m$ is a linear combination of $g_1, \ldots, g_m$ with coefficients
which are rational numbers whose denominators are $(n - 1)(n - 2) \cdots (n - m + 1)$.
This fact is another aspect of the non existence of unbiased estimates
of order $n$ for $F'_m$ when $m > n$.
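The two identities for $f_2$ and $f_3$ can be verified exactly on a small sample. The sketch below is illustrative: it assumes the expansions $F'_2 = F_2 - F_1^2$ and $F'_3 = F_3 - 3F_1F_2 + 2F_1^3$, estimates each product by its symmetrized function, and uses hypothetical helper names throughout.

```python
from fractions import Fraction
from itertools import permutations


def symmetrize(phi, k, xs):
    """Value of phi^[n] at xs: average of phi over ordered k-subsets of xs."""
    tuples = list(permutations(xs, k))
    return sum(Fraction(phi(*t)) for t in tuples) / len(tuples)


def f2(xs):
    """Symmetric unbiased estimate of F'_2 = F_2 - F_1^2."""
    return symmetrize(lambda a: a * a, 1, xs) - symmetrize(lambda a, b: a * b, 2, xs)


def f3(xs):
    """Symmetric unbiased estimate of F'_3 = F_3 - 3 F_1 F_2 + 2 F_1^3."""
    return (symmetrize(lambda a: a ** 3, 1, xs)
            - 3 * symmetrize(lambda a, b: a * b * b, 2, xs)
            + 2 * symmetrize(lambda a, b, c: a * b * c, 3, xs))


def g(m, xs):
    """Sample moment of order m about the sample mean."""
    n = len(xs)
    mean = Fraction(sum(xs), n)
    return sum((x - mean) ** m for x in xs) / n


if __name__ == "__main__":
    xs = [1, 2, 4, 7]
    n = len(xs)
    print(f2(xs), Fraction(n, n - 1) * g(2, xs))
    print(f3(xs), Fraction(n * n, (n - 1) * (n - 2)) * g(3, xs))
```

For the sample (1, 2, 4, 7) both lines print matching pairs (7 and 16), as the formulas predict.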

(2) For any Borel set $E$ on the real line denote by $F_E(P)$ the probability, $P(E)$,
assigned by $P$ to $E$. If $\varphi_E(x)$ is the characteristic function of the set $E$, the
representation

\[
F_E(P) = \int \varphi_E(x)\, dP(x)
\]

shows that $F_E(P)$ is homogeneous of degree 1, and therefore possesses unbiased
estimates of all orders. The symmetric unbiased estimate of order $n$ is given, in
perfect accordance with intuitive demands, by the function $f_E(x_1, \ldots, x_n)$ whose
value is $\frac{1}{n}$ times the number of those coordinates $x_i$ which belong to $E$.


(3) The situation in estimating such "non linear" functions as $(F_1(P))^2$ is
somewhat paradoxical. In the first place it appears strange that there should be
essentially different processes for estimating the expected value and the square of
the expected value. (Recall that since

\[
(F_1(P))^2 = \int\!\!\int x_1 x_2\, dP(x_1)\, dP(x_2),
\]

the symmetric unbiased estimate of $(F_1(P))^2$, of order $n$, is $(x_1 x_2)^{[n]}$.) Consider,
for instance, the distribution $P$ which assigns probability $\frac{1}{2}$ to each of the points
$+1$ and $-1$. The symmetric unbiased estimate of order 2 for $F_1(P)$ is $\frac{1}{2}(x_1 + x_2)$,
and for $(F_1(P))^2$ it is $x_1 x_2$. Hence in the four possible cases

    (1, 1), (1, -1), (-1, 1), (-1, -1)

the biased, incorrect estimate $\{\frac{1}{2}(x_1 + x_2)\}^2$ for $(F_1(P))^2$ yields

    1, 0, 0, 1

whereas the unbiased, correct estimate yields

    1, -1, -1, 1.

The actual value of $(F_1(P))^2$ is, of course, 0. Hence it is true in this case that
whenever the biased estimate is in error, the unbiased one errs by the same
amount. To add insult to injury, the unbiased procedure even yields negative
estimates for the essentially non negative quantity $(F_1(P))^2$. These considerations
seem to indicate the necessity for caution in using unbiased estimates of
"non linear" quantities, such for instance as $F'_m(P)$.
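The two-point example can be enumerated in a few lines. This sketch simply lists the four equally likely samples and evaluates both estimates of the squared mean; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

points = [1, -1]  # each value has probability 1/2
samples = list(product(points, repeat=2))  # (1,1), (1,-1), (-1,1), (-1,-1)

# Square of the sample mean (biased) and the product x1*x2 (unbiased).
biased = [Fraction(a + b, 2) ** 2 for a, b in samples]
unbiased = [a * b for a, b in samples]

mean_biased = sum(biased) / len(samples)
mean_unbiased = Fraction(sum(unbiased), len(samples))

if __name__ == "__main__":
    print(samples)
    print(biased, mean_biased)
    print(unbiased, mean_unbiased)
```

The biased estimate averages 1/2 against a true value of 0, while the unbiased one averages 0 but is never closer to 0 than the biased one in any single sample.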











SOME SIGNIFICANCE TESTS BASED ON ORDER STATISTICS 
By John E. Walsh
Princeton University 


1. Summary. In this paper significance tests are developed whose application 
requires only the determination of one order statistic and the computation of 
sums of sample values. The simplest case considered is that of testing a new 
sample value $x$ on the basis of $m$ previous sample values $y_1, \ldots, y_m$, all sample
values being assumed from normal populations with the same variance. Two
separate tests of whether the mean of the new population from which $x$ was taken
exceeds the mean of the population from which $y_1, \ldots, y_m$ were drawn consist in
accepting the alternative that the new population mean exceeds the old population
mean if

\[
x > \Bigl(\frac{\sqrt{m+1}+1}{m}\Bigr) \sum_{i=1}^{m} y_i - \sqrt{m+1}\; y_{(u)} \tag{1}
\]

\[
x > -\Bigl(\frac{\sqrt{m+1}-1}{m}\Bigr) \sum_{i=1}^{m} y_i + \sqrt{m+1}\; y_{(m+1-u)}, \tag{2}
\]

where $y_{(u)}$ is the $u$th largest of $y_1, \ldots, y_m$. It can be shown that both of these


tests have the same power so that either one might be equally well selected for 
use. In practical application, however, there may exist reasons for preferring 
one test to the other. Similarly, the alternative that the new population mean 
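is less than the old population mean will be accepted if

\[
x < \Bigl(\frac{\sqrt{m+1}+1}{m}\Bigr) \sum_{i=1}^{m} y_i - \sqrt{m+1}\; y_{(m+1-u)} \tag{3}
\]

\[
x < -\Bigl(\frac{\sqrt{m+1}-1}{m}\Bigr) \sum_{i=1}^{m} y_i + \sqrt{m+1}\; y_{(u)}. \tag{4}
\]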


All four of these significance tests have the same power, also the same significance
level $\alpha(u, m)$. By appropriate choice of $u$ and $m$ the significance level can be
made to assume values suitable for significance tests. For example,

    α(1, 6) = .0156,    α(2, 10) = .0107,
    α(3, 13) = .0110,   α(4, 16) = .0107.


The above tests are still valid if each of $x, y_1, \ldots, y_m$ equals a sum of $r$ sample
values.

These order statistic tests are generalized to the case where $x$ is a sum of $r$ new
sample values; $y_1, \ldots, y_m$ each equals a sum of $s$ past sample values; and another
sum of relatively weighted past sample values is utilized but not as an order
statistic. The introduction of this relatively weighted sum allows less reliable
past information to be lumped together and weighted according to its relative
importance.
In comparing the order statistic tests with the most powerful tests which could
be used for these alternatives it is found that the size of the samples used must be 
increased in order to bring the efficiency of the order statistic test up to that of 
the corresponding most powerful test. Thus the advisability of using the order 
statistic test will depend upon whether it is more desirable to take larger samples 
but have less computation. 


2. Introduction. Many statistical problems are concerned with the determi- 
nation of whether a new sample can be considered as having been drawn from the 
same population as that from which a previous sample was taken. Frequently 
this reduces to the question of whether the mean of the population from which 
the new sample came is greater than the mean of the past sample population. 
The problem of whether the new population mean is less than that of the old 
population is also occasionally investigated. If both populations can be con- 
sidered normal with the same variance, it is well known that the most powerful 
Studentized test of each of these one-sided alternatives is furnished by use of the 
appropriate Student t-test. When the number of previous sample values from 
which the test is determined is large, however, the computation of the numerical 
value required for the application of the Student t-test becomes lengthy. This 
calculation difficulty can become very important if the test is to be applied 
repeatedly as, for example, in quality control work. It is desirable, therefore, 
to develop other Studentized tests which are easily calculated and whose efficiency 
with relation to the corresponding Student ¢-tests is reasonably high. It is the 
purpose of this paper to develop tests of this type by the use of order statistics. 

The class of tests in which a new sample value $x$ is tested on the basis of $m$
previous sample values $y_1, \ldots, y_m$ used as order statistics is developed in detail.
The significance tests arising are the ones given in the summary above. For a
better intuitive understanding of what takes place rewrite (1) to (4) as

    (1')  $x - \bar{y} > \sqrt{m+1}\,(\bar{y} - y_{(u)})$
    (2')  $x - \bar{y} > \sqrt{m+1}\,(y_{(m+1-u)} - \bar{y})$
    (3')  $x - \bar{y} < \sqrt{m+1}\,(\bar{y} - y_{(m+1-u)})$
    (4')  $x - \bar{y} < \sqrt{m+1}\,(y_{(u)} - \bar{y})$,

where $\bar{y}$ is the average of the $y_i$. The relative efficiencies of these tests with
respect to the corresponding Student ¢-tests are determined and the simplicity 
of the computation necessary for their application is outlined. The method of 
attack having been sufficiently indicated by the development of this special 
class of tests, more general tests based on order statistics are stated but not proved 
here. 


3. Statement of the significance tests. Let each of $x, y_1, \ldots, y_m$ be
distributed independently of all the others, $x$ according to $N(\nu, \sigma^2)$ and the $y_i$,
$(i = 1, \ldots, m)$, according to $N(\mu, \sigma^2)$, where the notation $N(\xi, \sigma^2)$ signifies the
normal distribution with mean $\xi$ and variance $\sigma^2$. As above let $y_{(u)}$ denote the
$u$th largest of $y_1, \ldots, y_m$. The one-sided significance tests are then stated as
follows:
If

\[
x > \begin{cases}
\dfrac{1}{K_2} \sum_{i=1}^{m} y_i - \dfrac{K_1}{K_2}\, y_{(u)} & (K_2 > 0)\\[2ex]
\dfrac{1}{K_2} \sum_{i=1}^{m} y_i - \dfrac{K_1}{K_2}\, y_{(m+1-u)} & (K_2 < 0)
\end{cases} \tag{5}
\]

accept the alternative $\mu < \nu$, otherwise accept the hypothesis tested, namely
that $\mu = \nu$.
If

\[
x < \begin{cases}
\dfrac{1}{K_2} \sum_{i=1}^{m} y_i - \dfrac{K_1}{K_2}\, y_{(m+1-u)} & (K_2 > 0)\\[2ex]
\dfrac{1}{K_2} \sum_{i=1}^{m} y_i - \dfrac{K_1}{K_2}\, y_{(u)} & (K_2 < 0)
\end{cases} \tag{6}
\]

accept $\nu < \mu$, otherwise accept $\nu = \mu$.
The constants $K_1$ and $K_2$ are given by

\[
K_1 = m + 1 \pm \sqrt{m+1}, \qquad K_2 = -1 \mp \sqrt{m+1}, \tag{7}
\]

where all upper signs or all lower signs will be chosen so that to a given value of $K_1$
there is but one value of $K_2$. This rule for the choice of signs will hold throughout
the paper.

It is to be noted that (5) defines two separate significance tests of the hypothesis
$\mu = \nu$ against the alternative $\mu < \nu$ depending upon whether it is decided to use
the positive or the negative value given for $K_2$. A similar statement applies
to the two significance tests defined by (6).

Each of these four significance tests can be shown to have the same significance
level, which is determined by the values of $u$ and $m$. Denote this significance
level by $\alpha(u, m)$. Then it can be demonstrated that

\[
\begin{aligned}
\alpha(1, m) &= (\tfrac{1}{2})^m, & \alpha(2, m) &= (m + 1)(\tfrac{1}{2})^m,\\
\alpha(3, m) &= (m^2 + m + 2)(\tfrac{1}{2})^{m+1}, & \alpha(4, m) &= \tfrac{1}{3}(m^3 + 5m + 6)(\tfrac{1}{2})^{m+1}.
\end{aligned}
\]

The general expression for $\alpha(u, m)$ is given by (12).
It is to be observed that the application of these tests is independent of the 
parameters of the normal populations in question. 
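The closed forms for the significance level are simple to evaluate. The sketch below assumes the readings $\alpha(3, m) = (m^2 + m + 2)(\tfrac{1}{2})^{m+1}$ and $\alpha(4, m) = \tfrac{1}{3}(m^3 + 5m + 6)(\tfrac{1}{2})^{m+1}$, which are reconstructions of a damaged passage. It also checks an observation that is not stated in the text: each closed form equals the probability that a Binomial(m, 1/2) variable is less than u. The function names are hypothetical.

```python
from fractions import Fraction
from math import comb


def alpha_closed(u, m):
    """Significance level alpha(u, m) from the closed forms, u = 1, ..., 4."""
    half = Fraction(1, 2)
    if u == 1:
        return half ** m
    if u == 2:
        return (m + 1) * half ** m
    if u == 3:
        return (m * m + m + 2) * half ** (m + 1)
    if u == 4:
        return Fraction(m ** 3 + 5 * m + 6, 3) * half ** (m + 1)
    raise ValueError("closed forms are given only for u = 1, ..., 4")


def alpha_binomial(u, m):
    """P(Binomial(m, 1/2) < u); an assumed identification, used as a check."""
    return sum(comb(m, j) for j in range(u)) * Fraction(1, 2) ** m


if __name__ == "__main__":
    print(float(alpha_closed(1, 6)), float(alpha_closed(2, 10)))
```

With these readings, $\alpha(1, 6)$ and $\alpha(2, 10)$ round to .0156 and .0107, the values quoted in the summary.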


4. Analysis. An analysis will be given for the development of the significance
test in which the alternative is $\mu < \nu$ and $K_2 > 0$. The developments of the
properties of the other three tests are almost identical with that for this case and
will not be given here.
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Now consider this analysis. Let 


= - 
go = BE, 


o 


Then 2’ and the y; are independently distributed according to N(0, 1). 
1 ee, — » 
y= —>+(Kiy — > v: + Kez’ }, (u=1 
Ky i 
It is easily seen that

$$
E(r_u) = 0, \qquad E(r_u^2) = \frac{1}{K_1^2}\,(K_2^2 + K_1^2 - 2K_1 + m),
$$

$$
E(r_u r_v) = \frac{1}{K_1^2}\,(K_2^2 - 2K_1 + m), \qquad (u \ne v).
$$

Thus the condition which must be fulfilled in order that the $r_u$ be independently
distributed according to $N(0, 1)$ is that

$$
K_2^2 - 2K_1 + m = 0. \tag{8}
$$


To insure that the $r_u$ are independent of $\mu$ when $\mu = \nu$ it is evidently necessary
that

$$
K_1 - m + K_2 = 0. \tag{9}
$$

Solving (8) and (9) for $K_1$ and $K_2$ one obtains (7).
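As a quick check of this step, the sketch below evaluates the constants of (7) for either choice of signs and confirms that conditions (8) and (9) hold and that the second moments of the $r_u$ reduce to variance 1 and covariance 0. The helper names `constants` and `moments` are illustrative and do not come from the paper.

```python
from math import sqrt


def constants(m: int, upper: bool = True) -> tuple[float, float]:
    """Return (K1, K2) from (7); upper=True takes all upper signs."""
    root = sqrt(m + 1)
    if upper:
        return m + 1 + root, -1 - root
    return m + 1 - root, -1 + root


def moments(m: int, k1: float, k2: float) -> tuple[float, float]:
    """E(r_u^2) and E(r_u r_v), u != v, for standardized x' and y'_i."""
    var = (k2 ** 2 + k1 ** 2 - 2 * k1 + m) / k1 ** 2
    cov = (k2 ** 2 - 2 * k1 + m) / k1 ** 2
    return var, cov


if __name__ == "__main__":
    for upper in (True, False):
        k1, k2 = constants(10, upper)
        print(upper, k1, k2, moments(10, k1, k2))
```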
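Restrict the $r_u$ by conditions (8) and (9) and let $r_{(u)}$ be the $u$th largest of
$r_1, \dots, r_m$. From (8) $K_1 > 0$; therefore

$$
r_{(u)} = \frac{1}{K_1}\Bigl[K_1 y_{(u)}' - \sum_{i=1}^{m} y_i' + K_2 x'\Bigr],
$$

where $y_{(u)}'$ is the $u$th largest of $y_1', \dots, y_m'$. Then using (9),

$$
r_{(u)} = \frac{1}{K_1 \sigma}\Bigl[K_1 y_{(u)} - \sum_{i=1}^{m} y_i + K_2 x + K_2(\mu - \nu)\Bigr].
$$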


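From the definition of the power function and (5) for $K_2 > 0$, it follows that
the power function for this test is given by

$$
\begin{aligned}
\text{Power Function} &= \Pr\Bigl[\, x > \frac{1}{K_2}\sum_{i=1}^{m} y_i - \frac{K_1}{K_2}\, y_{(u)} \Bigr] \\
&= \Pr\Bigl[\, 0 < K_1 y_{(u)} - \sum_{i=1}^{m} y_i + K_2 x < \infty \Bigr] \\
&= \Pr\Bigl[\, \frac{K_2}{K_1\sigma}(\mu - \nu) < \frac{1}{K_1\sigma}\Bigl(K_1 y_{(u)} - \sum_{i=1}^{m} y_i + K_2 x + K_2(\mu - \nu)\Bigr) < \infty \Bigr] \\
&= \Pr\Bigl[\, \frac{K_2}{K_1\sigma}(\mu - \nu) < r_{(u)} < \infty \Bigr].
\end{aligned}
\tag{10}
$$
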
The distribution function of the order statistic $r_{(u)}$ may be found in [1], from
which it follows that

$$
\text{Power Function} = \frac{m!}{(u-1)!\,(m-u)!}
\int_{\frac{K_2}{K_1\sigma}(\mu-\nu)}^{\infty}
\Bigl(\int_{-\infty}^{x} f(y)\,dy\Bigr)^{u-1}
\Bigl(\int_{x}^{\infty} f(y)\,dy\Bigr)^{m-u} f(x)\,dx,
\tag{11}
$$

where

$$
f(y) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac12 y^2}.
$$

Consider the value of the power function under the assumption that the hypothesis
is true. Then $\mu = \nu$ and from (11) the significance level of the test is given by

$$
\alpha(u, m) = \frac{m!}{(u-1)!\,(m-u)!}
\int_{0}^{\infty}
\Bigl(\int_{-\infty}^{x} f(y)\,dy\Bigr)^{u-1}
\Bigl(\int_{x}^{\infty} f(y)\,dy\Bigr)^{m-u} f(x)\,dx.
\tag{12}
$$

The method used to eliminate $\sigma$ from the quantities required for the application
of the significance test, therefore, is to have the limits $0$ and $\infty$ in the probability
expression (10) for the power function when the hypothesis is true. Suitable
significance levels are obtained by varying the statistical function $r_{(u)}$ by means
of the selection of the values of $u$ and $m$.


5. Comparison with Student t-test. The test considered is that of a single
sample value on the basis of $m$ other sample values. Hence, the corresponding
Student t-test has $m - 1$ degrees of freedom. The probabilities of Type II
errors for the Student t-tests are calculated for values of

$$
\delta = \frac{\mu - \nu}{\sigma\sqrt{1 + \frac{1}{m}}}
$$

by use of the normal approximation given in [2].

Using this notation

$$
\frac{K_2}{K_1\sigma}(\mu - \nu) = \frac{K_2}{K_1}\sqrt{1 + \frac{1}{m}}\;\delta,
$$

and from (11) the power function for the significance test for which the alternative
is $\mu < \nu$ and $K_2 > 0$ is found to be

$$
\frac{m!}{(u-1)!\,(m-u)!}
\int_{\frac{K_2}{K_1}\sqrt{1 + \frac{1}{m}}\,\delta}^{\infty}
\Bigl(\int_{-\infty}^{x} f(y)\,dy\Bigr)^{u-1}
\Bigl(\int_{x}^{\infty} f(y)\,dy\Bigr)^{m-u} f(x)\,dx.
$$

The probability of a Type II error for a given value of $\delta$ is equal to one minus
the value of the power function for this value of $\delta$.

It can be proved that the other three significance tests have the same probabilities
of Type II errors as the one considered above.
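The Type II error of the order statistic test can be evaluated without numerical integration. The sketch below is a hedged illustration, not the authors' computation: it assumes $\delta = (\mu - \nu)/(\sigma\sqrt{1 + 1/m})$, takes the lower signs in (7) so that $K_2 > 0$, reads $r_{(u)}$ as the $u$th value in increasing order, and replaces the integral in (11) by the equivalent binomial sum (at most $u - 1$ of the $m$ standardized quantities fall below the lower limit). The function names are hypothetical.

```python
from math import comb, erf, sqrt


def normal_cdf(z: float) -> float:
    """Standard normal distribution function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def type_two_error(u: int, m: int, delta: float) -> float:
    """One minus the power of the order statistic test (alternative mu < nu, K2 > 0)."""
    k2 = -1.0 + sqrt(m + 1)           # lower signs in (7), so K2 > 0
    k1 = m + 1 - sqrt(m + 1)
    lower = (k2 / k1) * sqrt(1.0 + 1.0 / m) * delta  # lower limit of the integral
    p = normal_cdf(lower)              # chance that one r_i falls below the limit
    power = sum(comb(m, j) * p ** j * (1 - p) ** (m - j) for j in range(u))
    return 1.0 - power


if __name__ == "__main__":
    for delta in (-1, -2, -3, -4):
        print(delta, round(type_two_error(1, 6, delta), 3))
```

Under these assumptions the values for $m = 6$, $u = 1$ come out close to the first entries reported for the order statistic test in Table I; at $\delta = 0$ the result is one minus the significance level.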

The numerical comparison of the two types of tests is contained in Table I.
In each instance the significance level was chosen to be approximately .01.

The process of increasing the size of each sample by a given percentage has
practical meaning if each of $x, y_1, \dots, y_m$ equals the sum of $r$ sample values.
For example, if $x, y_1, \dots, y_m$ each consist of the sum of ten sample values,
increasing the sample size by 30% would amount to letting $x, y_1, \dots, y_m$ each
equal the sum of thirteen sample values. The case where each of $x, y_1, \dots, y_m$
equals the sum of $r$ sample values will be treated later and will be shown to be
a particular case of the one analyzed above.

TABLE I

Probability of Type II error for the Student t-test (t) and the order statistic test (O.S.)

| Test | Degrees of freedom (t) or m (O.S.) | % increase in sample size | Significance level | δ = -1 | δ = -2 | δ = -3 | δ = -4 |
|------|-----|----|-------|------|------|------|------|
| t    | 5   | 0  | .0156 | .919 | .750 | .477 | .215 |
| O.S. | 6   | 0  | .0156 | .919 | .752 | .506 | .276 |
| O.S. | 6   | 5  | .0156 | .916 | .742 | .486 | .256 |
| O.S. | 6   | 10 | .0156 | .914 |      | .469 | .239 |
| t    | 9   | 0  | .0107 | .930 | .735 | .413 | .142 |
| O.S. | 10  | 0  | .0107 | .936 | .782 | .527 | .270 |
| O.S. | 10  | 20 | .0107 | .927 | .738 | .448 | .191 |
| O.S. | 10  | 30 | .0107 | .921 | .715 | .411 | .161 |
| t    | 12  | 0  | .0110 | .920 | .699 | .358 | .106 |
| O.S. | 13  | 0  | .0110 | .933 | .771 | .492 | .245 |
| O.S. | 13  | 30 | .0110 | .919 | .717 | .378 | .139 |
| O.S. | 13  | 40 | .0110 | .913 | .679 | .353 | .119 |
| t    | 15  | 0  | .0107 | .919 | .688 | .337 | .092 |
| O.S. | 16  | 0  | .0107 | .938 | .765 | .488 | .234 |
| O.S. | 16  | 40 | .0107 | .917 | .687 | .351 | .111 |
| O.S. | 16  | 50 | .0107 | .912 | .664 | .310 | .090 |

In Table I the order statistic tests (O.S.) are calculated for cases where the
size of each sample is increased by the same percentage. This amounts to saying
that the amount of information used for the test has been increased by this
percentage. This method furnishes a quantitative estimate of the relative
efficiency of the order statistic test as compared with the corresponding Student
t-test. For example, if 30% more information is required for the order statistic
test to have the same probabilities of Type II errors as the corresponding Student
t-test, then the order statistic test will be said to have a relative efficiency of

$$
\frac{100}{130} \approx 77\%.
$$

Examination of Table I shows that the order statistic tests have the approximate
relative efficiencies listed in Table II. These relative efficiencies can be
shown to be approximately the same as those for other significance levels.


6. Computation required. Since application of the order statistic test requires 
only the determination of one order statistic, the calculation of one sum, the 
multiplication of each of these quantities by given constants and the subtraction 
of the resulting values, the amount of computation required for application of the 
order statistic test is obviously much less than is necessary for the application of 
the corresponding Student t-test. 

TABLE II

| m  | Significance level | % increase in sample size | Relative efficiency |
|----|-------|----|-----|
| 6  | .0156 | 5  | 95% |
| 10 | .0107 | 25 | 80% |
| 13 | .0110 | 35 | 74% |
| 16 | .0107 | 43 | 70% |

If the test is applied continuously from one sample to the next, as in quality
control work, the value of $\sum_{i=1}^{m} y_i$ can be calculated by a continuous process. For
let the sample values be taken in the order $y_1, \dots, y_m, x$, where $x$ is the new
sample value which is to be tested on the basis of the previous $m$ sample values
$y_1, \dots, y_m$. Then $x$ for the present test becomes $y_m$ for the next test; $y_m$ becomes
$y_{m-1}$; $\dots$; $y_2$ becomes $y_1$, and $y_1$ for the present test is no longer used.
The value of $x$ will be furnished by the next sample value drawn. Thus, $\sum_{i=1}^{m} y_i$
for the next test is calculated by adding $x - y_1$ for the present test to $\sum_{i=1}^{m} y_i$ for
the present test. The order statistic can be easily determined from a plot of the
sample values which is also applied continuously from one sample to the next.
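The continuous updating of the sum described above can be sketched as follows. This is only an illustration of the bookkeeping; the class name `RunningWindow` and its method are hypothetical.

```python
from collections import deque


class RunningWindow:
    """Keeps the last m sample values and their sum, updated in O(1) per value."""

    def __init__(self, values):
        self.window = deque(values)      # y_1, ..., y_m in the order drawn
        self.total = sum(self.window)    # current value of the sum of the y_i

    def push(self, x):
        """Shift the window: x becomes y_m, the old y_1 is dropped."""
        oldest = self.window.popleft()
        self.window.append(x)
        self.total += x - oldest         # add x - y_1 to the previous sum
        return self.total


if __name__ == "__main__":
    w = RunningWindow([1, 2, 3])
    print(w.push(10), list(w.window))
```

Only one addition and one subtraction are needed per new sample value, whatever the value of $m$.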


7. Generalization of results. The derivations given above are immediately
applicable to the case where $x$ represents the sum of $r$ sample values from a population
with distribution $N(\nu', \sigma'^2)$, and each $y_i$, $(i = 1, \dots, m)$, equals the sum
of $r$ sample values from a population with distribution $N(\mu', \sigma'^2)$. Then $x$ would
be distributed according to $N(r\nu', r\sigma'^2)$ and the $y_i$ would be distributed according
to $N(r\mu', r\sigma'^2)$. These distributions are of the form $N(\nu, \sigma^2)$ and $N(\mu, \sigma^2)$, where


$$
\mu = r\mu', \qquad \nu = r\nu' \qquad \text{and} \qquad \sigma^2 = r\sigma'^2.
$$

If $x$ equals the sum of $r$ sample values from a population with distribution
$N(\nu, \sigma^2)$ and each $y_i$, $(i = 1, \dots, m)$, equals the sum of $s$ sample values from a
population with distribution $N(\mu, \sigma^2)$, the significance tests are derived in a
similar manner and can still be stated in the forms (5) and (6), but the values of
$K_1$ and $K_2$ become

$$
K_1 = m + \frac{r}{s} \pm \sqrt{\frac{r}{s}\Bigl(m + \frac{r}{s}\Bigr)}, \qquad
K_2 = -1 \mp \sqrt{\frac{s}{r}\Bigl(m + \frac{r}{s}\Bigr)}.
$$

The power function for the test in which the alternative is $\mu < \nu$ and $K_2 > 0$
is found by replacing $\dfrac{K_2}{K_1\sigma}(\mu - \nu)$ by $\dfrac{r K_2}{\sqrt{s}\,K_1\sigma}(\mu - \nu)$ in (11). The significance
level of each of the four tests is again given by (12) and it can be shown that
each test has the same power.

To this point all significance tests considered have consisted of testing a new
sample on the basis of $m$ previous samples used as order statistics. In some
cases, however, it may be desirable to utilize additional samples in the test but
not as order statistics. These sample values can be gathered together in a
summation term in which values from different samples are given relative
numerical weighting. This procedure can be used to emphasize those sample
values which appear to be more important from practical considerations relative
to those which seem to have less importance. The determination of
what relative weighting scheme to use is to be decided by the person applying
the test and is not considered as a problem of this paper. The significance
tests with this property can be stated as follows:

Let each of $x_a$, $y_{ub}$, $z_{jc}$, $(a = 1, \dots, r;\ b = 1, \dots, s;\ c = 1, \dots, n_j;\ u =
1, \dots, m;\ j = 1, \dots, n)$, be distributed independently of all the others, the $x_a$
according to $N(\nu, \sigma^2)$ and the $y_{ub}$ and $z_{jc}$ according to $N(\mu, \sigma^2)$. Define
$y_u = \sum_{b=1}^{s} y_{ub}$, $(u = 1, \dots, m)$, and let $y_{(u)}$ be the $u$th largest of $y_1, \dots, y_m$. The
one-sided significance tests are then given by


If 





2 tc > =e ‘oto (Ke > 0) 


9 
- 


¥ ta > ovr ¥, (Kz < 0) 








Ko 
accept the alternative u < v, otherwise accept u = »v. 
If 
> Le < =e Vi (Kz > 0) 
pm Xa < =f Viati~w (Ke < 0), 
1 2 


accept $\mu > \nu$, otherwise accept $\mu = \nu$.

The quantity $V_u$ is given by


m K n C; nj 
We = ey ve ew + DE (E Ae), 


i=1 j=1 V 1; \c=l 





where the constants $C_j$, $(j = 1, \dots, n)$, are defined by $C_j = w_j \eta$, the $w_j$ being
given positive weights. The values of $\eta$, $K_1$ and $K_2$ are


A ! 
B 5 Ke i m 
EVE wa a g/m) 


m+ A?/B’ m + A?/B 
m + A?/B 


ky = — 5 
| Bom + A’/BY + A’ (:)| 
. {Bm /* + +m (1) + Bm | Bim + A?/B)? + A? (2) 


n 
9 
V7; 5 B=) vw}. 
1 


The quantity $\eta$ in the expressions for the $C_j$ is not considered given but is
determined in the derivation of the tests. The two equations corresponding to
(8) and (9) then contain three undetermined quantities $\eta$, $K_1$, and $K_2$. Thus
there are infinitely many possible selections of these quantities, each selection
resulting in a valid significance test. The values of $\eta$, $K_1$ and $K_2$ given above,
however, are the ones which result in the maximum power function and consequently
the smallest probabilities of Type II errors. The power function for the





where 








K. 
Kio 





test in which the alternative is » < vand K, > Ois that given in 


-(u — v) replaced by Av 


cys Ty —v). The significance level of each of the four 
1d 
tests given above is still that of (12). It can also be shown that each of the tests 


has the same probabilities of Type II errors.


CHAINS OF RARE EVENTS

By Felix Cernuschi and Louis Castagnetto

Harvard University

1. Summary. The negative binomial distribution of Greenwood and Yule is
generalized and modified in order to obtain distribution curves which could be
used in many concrete cases of chains of rare events. Assuming that the numbers
of single, double, triple, and so on, events are distributed according to Poisson's
law with parameters $\lambda_1, \lambda_2, \lambda_3, \dots$ respectively, and that $\lambda_s$ is given by

$$
\lambda_s = \lambda_1 \frac{a^{s-1}}{s!},
$$

the probability of obtaining $n$ successful events is studied. In the
considered relation $\lambda_s$, for convenient values of $a$, first increases with $s$ and after
a certain saturation value of $s$ starts to decrease. A relation of this type is very
suitable for studying the distribution of score in a match between two first class
billiard players, the probability of accidents on a highway of dense traffic, etc.
The general methods of finding the distribution curves for arbitrary relations
between the $\lambda$'s are indicated. The method of steepest descent is applied to find
an acceptable approximation of the distribution function; and the advantage of
this method is pointed out for other similar cases, in addition to the concrete one
which was developed, in which the method of direct expansion into power series
becomes inapplicable.


2. Introduction. M. Greenwood and G. U. Yule [1] have deduced the negative
binomial distribution from a compound Poisson law:

$$
P(m, \lambda) = \frac{\lambda^m}{m!}\, e^{-\lambda},
$$

where $\lambda$ itself is a random variable distributed according to Pearson's law of
type III:

$$
P(\lambda)\, d\lambda = \frac{\beta^{\alpha+1}}{\alpha!}\, \lambda^{\alpha} e^{-\beta\lambda}\, d\lambda.
$$

They obtained the distribution

$$
P(m) = (1 - a)^{\alpha+1}\, \frac{(\alpha + m)!}{\alpha!\, m!}\, a^m,
$$

where $1 - a = \dfrac{\beta}{\beta + 1}$. As is easily seen, $P(m)$ is given by the coefficient
of $x^m$ in the expansion of:

$$
(1 - a)^{\alpha+1} \Bigl(1 - \frac{x}{\beta + 1}\Bigr)^{-(\alpha+1)}.
$$

1 Research done at Harvard Astronomical Observatory as Guggenheim Fellow.

R. Lüders [2] has arrived at a negative binomial law by the following considerations.
Certain events, like automobile accidents, can be classified as simple or
multiple according to the number of units involved. Assume that the numbers of
single, double, triple, and so on, events are distributed according to Poisson's law
with the parameters $\lambda_1, \lambda_2, \lambda_3, \dots$, respectively. The probability of obtaining
$n_1$ single, $n_2$ double, $n_3$ triple, $\dots$ successful events is (assuming mutual
independence)

$$
P(n_1, n_2, n_3, \dots; \lambda_1, \lambda_2, \lambda_3, \dots) = \frac{\lambda_1^{n_1}}{n_1!}\, \frac{\lambda_2^{n_2}}{n_2!} \cdots e^{-(\lambda_1 + \lambda_2 + \cdots)}. \tag{1}
$$

The total number of successful events is

$$
n = n_1 + 2n_2 + 3n_3 + \cdots + i\, n_i + \cdots. \tag{2}
$$

The probability of obtaining $n$ successful events is given by the sum of all expressions
(1) subject to the condition (2). This sum is given by the coefficient of
$x^n$ in the expansion

$$
f(x) = e^{\lambda_1 (x - 1)}\, e^{\lambda_2 (x^2 - 1)}\, e^{\lambda_3 (x^3 - 1)} \cdots. \tag{3}
$$

















Now if the parameters A, satisfy 
(4) 


one finds 





12,1) j P 
(6) P(n) = (1 — a)? MM + 1) (B+ n — 1) . 


a 


Taking ~ equal to a + 1 one gets Greenwood and Yule’s distribution in the 
a 


form given above [3]. The negative binomial law has useful applications, for 
instance in some cases of accidents of workers in factories. It is proved that 
with values of a near 1, the most probable value for n is n = 0 and the average 
value is a finite number different from zero. Therefore the distribution will be 
in some way similar to the distribution of the scores in a match between two first 
class billiard players whose most frequent scores are zero and their average may 
be, say, 50. In the case of the Poisson distribution the most frequent score and 
the average score should be nearly the same. The relation (4) does not provide 
an adequate description of many practical distributions. For instance, in a 
match between two first class billiard players, the probability of making a second,
third, $\dots$, point will be considerably greater than the probability of making
the first. With the relation (4) $\lambda_s$ is a decreasing function of $s$, while we shall
investigate cases in which $\lambda_s$ first increases with $s$ and after a certain value of $s$
starts to decrease. As other examples of distributions of similar types we shall
mention the following: On a highway with dense traffic at high speeds the prob- 
ability of only one car being involved in an accident may be smaller than the 
probability of having several cars involved. Something similar may be said for 
the cases of work accidents in factories where the work of one is interconnected 
with the work of others. In many cases of telephone calls (business transactions, 
organization of meetings, etc.) the simple Poisson law is not suitable to interpret
the distribution of calls, since one call may increase the probability that the
called person makes one or more calls. 

The purpose of this paper is to treat the problem when, instead of (4), we take 
other expressions which may in a better way describe some processes such as the 
ones which we have referred to. 


3. Modification and generalization of the scheme of Greenwood-Yule and
Lüders. According to the relation (4) $\lambda_s$ is a decreasing function of $s$ and the
parameter $a$ must be in the interval $0 < a < 1$. Instead of (4) we shall use

$$
\lambda_s = \lambda_1 \frac{a^{s-1}}{s!}, \tag{7}
$$

where $a$ may have any positive value. In particular for $a = 0$ our case reduces
to the Poisson case.
From (7) it follows that

$$
\frac{\lambda_{s+1}}{\lambda_s} = \frac{a}{s + 1} \tag{8}
$$

and we see that $\lambda_s$ increases with $s$ for $s + 1 < a$ and decreases for $s + 1 > a$.
Substituting from (7) in (3) we get

$$
f(x) = e^{-\frac{\lambda_1}{a} e^{a}}\; e^{\frac{\lambda_1}{a} e^{a x}}. \tag{9}
$$

As the probability of obtaining $n$ successful events is given by the coefficient
of $x^n$ in (9), we shall expand $e^{\alpha e^{\beta x}}$ in power series ($\alpha$, $\beta$ being two arbitrary
constants). We have


(10) en? 
Now 


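(11)

where [4]

$$
\begin{aligned}
y_1(\alpha) &= \alpha \\
y_2(\alpha) &= \alpha^2 + \alpha \\
y_3(\alpha) &= \alpha^3 + 3\alpha^2 + \alpha \\
&\;\;\vdots \\
y_n(\alpha) &= \sum_{i=1}^{n} \frac{\Delta^i 0^n}{i!}\, \alpha^i.
\end{aligned}
\tag{12}
$$

Here we use the notation of differences of zero: $\Delta^i 0^n$. We have

$$
e^{\alpha e^{\beta x}} = e^{\alpha}\Bigl[1 + \sum_{n=1}^{\infty} \frac{\beta^n x^n}{n!}\, y_n(\alpha)\Bigr] \tag{13}
$$

or

$$
e^{\alpha e^{\beta x}} = e^{\alpha}\Bigl[1 + \sum_{n=1}^{\infty} \frac{\beta^n x^n}{n!} \sum_{i=1}^{n} \frac{\Delta^i 0^n}{i!}\, \alpha^i\Bigr]. \tag{14}
$$

Now in our case

$$
\alpha = \frac{\lambda_1}{a}, \qquad \beta = a, \tag{15}
$$

whence

$$
P(n) = e^{-\frac{\lambda_1}{a}(e^{a} - 1)}\, \frac{a^n}{n!} \sum_{i=1}^{n} \frac{\Delta^i 0^n}{i!} \Bigl(\frac{\lambda_1}{a}\Bigr)^{i}, \qquad n > 0, \tag{16}
$$

$$
P(0) = e^{-\frac{\lambda_1}{a}(e^{a} - 1)}, \qquad \text{for } n = 0. \tag{17}
$$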

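We have in particular

$$
\begin{aligned}
P(1) &= \lambda_1 P(0) \\
P(2) &= \tfrac{1}{2!}\,(\lambda_1^2 + a\lambda_1)\,P(0) \\
P(3) &= \tfrac{1}{3!}\,(\lambda_1^3 + 3\lambda_1^2 a + \lambda_1 a^2)\,P(0) \\
P(4) &= \tfrac{1}{4!}\,(\lambda_1^4 + 6\lambda_1^3 a + 7\lambda_1^2 a^2 + \lambda_1 a^3)\,P(0) \\
P(5) &= \tfrac{1}{5!}\,(\lambda_1^5 + 10\lambda_1^4 a + 25\lambda_1^3 a^2 + 15\lambda_1^2 a^3 + \lambda_1 a^4)\,P(0) \\
P(6) &= \tfrac{1}{6!}\,(\lambda_1^6 + 15\lambda_1^5 a + 65\lambda_1^4 a^2 + 90\lambda_1^3 a^3 + 31\lambda_1^2 a^4 + \lambda_1 a^5)\,P(0) \\
P(7) &= \tfrac{1}{7!}\,(\lambda_1^7 + 21\lambda_1^6 a + 140\lambda_1^5 a^2 + 350\lambda_1^4 a^3 + 301\lambda_1^3 a^4 + 63\lambda_1^2 a^5 + \lambda_1 a^6)\,P(0) \\
P(8) &= \tfrac{1}{8!}\,(\lambda_1^8 + 28\lambda_1^7 a + 266\lambda_1^6 a^2 + 1050\lambda_1^5 a^3 + 1701\lambda_1^4 a^4 + 966\lambda_1^3 a^5 \\
&\qquad + 127\lambda_1^2 a^6 + \lambda_1 a^7)\,P(0) \\
P(9) &= \tfrac{1}{9!}\,(\lambda_1^9 + 36\lambda_1^8 a + 462\lambda_1^7 a^2 + 2646\lambda_1^6 a^3 + 6951\lambda_1^5 a^4 + 7770\lambda_1^4 a^5 \\
&\qquad + 3025\lambda_1^3 a^6 + 255\lambda_1^2 a^7 + \lambda_1 a^8)\,P(0) \\
P(10) &= \tfrac{1}{10!}\,(\lambda_1^{10} + 45\lambda_1^9 a + 750\lambda_1^8 a^2 + 5880\lambda_1^7 a^3 + 22827\lambda_1^6 a^4 + 42525\lambda_1^5 a^5 \\
&\qquad + 34105\lambda_1^4 a^6 + 9330\lambda_1^3 a^7 + 511\lambda_1^2 a^8 + \lambda_1 a^9)\,P(0).
\end{aligned}
\tag{18}
$$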
For $\lambda_1 = a$ it follows that

$$
P(0) = e^{-(e^{a} - 1)} \tag{19}
$$

$$
P(n) = e^{-(e^{a} - 1)}\, \frac{a^n}{n!}\, y_n(1). \tag{20}
$$

» P(n) = ee” E ae yi —, wi) |. 
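Particular values of (20) are

$$
\begin{aligned}
P(1) &= a\,P(0) \\
P(2) &= a^2\,P(0) \\
P(3) &= \frac{5a^3}{3!}\,P(0) \\
P(4) &= \frac{15a^4}{4!}\,P(0) \\
P(5) &= \frac{52a^5}{5!}\,P(0) \\
P(6) &= \frac{203a^6}{6!}\,P(0) \\
P(7) &= \frac{877a^7}{7!}\,P(0) \\
P(8) &= \frac{4140a^8}{8!}\,P(0) \\
P(9) &= \frac{21147a^9}{9!}\,P(0) \\
P(10) &= \frac{115975a^{10}}{10!}\,P(0).
\end{aligned}
$$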
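The probabilities above can be reproduced numerically. The sketch below is illustrative only: it computes $P(n)$ from the expansion in differences of zero, using the fact that $\Delta^i 0^n / i!$ is the Stirling number of the second kind, and cross-checks the result against the standard recursion $n P(n) = \sum_{s=1}^{n} s \lambda_s P(n - s)$ for a generating function of the form $\exp\bigl(\sum_s \lambda_s (x^s - 1)\bigr)$. The recursion and all function names are assumptions introduced here, not taken from the paper.

```python
from math import exp, factorial


def stirling2_row(n: int) -> list[int]:
    """Stirling numbers of the second kind S(n, 0..n), i.e. Delta^i 0^n / i!."""
    row = [1]                                   # S(0, 0) = 1
    for size in range(1, n + 1):
        nxt = [0] * (size + 1)
        for i in range(1, size + 1):
            prev_same = row[i] if i < size else 0
            nxt[i] = i * prev_same + row[i - 1]  # S(n,i) = i S(n-1,i) + S(n-1,i-1)
        row = nxt
    return row


def p_direct(n: int, lam1: float, a: float) -> float:
    """P(n) from the power series expansion, for a > 0."""
    p0 = exp(-(lam1 / a) * (exp(a) - 1.0))
    if n == 0:
        return p0
    s = stirling2_row(n)
    series = sum(s[i] * (lam1 / a) ** i for i in range(1, n + 1))
    return p0 * a ** n / factorial(n) * series


def p_recursive(n_max: int, lam1: float, a: float) -> list[float]:
    """Cross-check using lambda_s = lam1 * a^(s-1) / s! and the recursion above."""
    lam = [0.0] + [lam1 * a ** (s - 1) / factorial(s) for s in range(1, n_max + 1)]
    p = [exp(-(lam1 / a) * (exp(a) - 1.0))]
    for n in range(1, n_max + 1):
        p.append(sum(s * lam[s] * p[n - s] for s in range(1, n + 1)) / n)
    return p


if __name__ == "__main__":
    print([round(p_direct(n, 1.0, 1.0), 4) for n in range(6)])
```

The row of Stirling numbers for $n = 10$ sums to 115975, the coefficient appearing in $P(10)$ when $\lambda_1 = a$.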


In Figure 1 we have graphed the curves $P(n)$ for the values $a = 1$; $\lambda_1 = 0.1$,
$\lambda_1 = 1$, $\lambda_1 = 2$. We see in particular, that for $\lambda_1 = 1$ we have $P(0) = P(1)$
and for $\lambda_1 = a = 1$ we have $P(0) = P(1) = P(2)$.


4. Application of the method of steepest descent. If $\lambda_s$ is not given by (7)
the above method of direct expansion of $f(x)$ into a power series usually becomes
inapplicable. In many cases it is possible to use instead the method of steepest
descent [5] in order to obtain approximate values for the coefficients of $x^n$ in the
relation (3).

As is well known, if $f(z)$ is an analytic function we have

$$
\text{coeff. of } x^n \text{ in } f(x) = \frac{1}{2\pi i} \oint \frac{f(z)}{z^{n+1}}\, dz = \frac{1}{2\pi i} \oint e^{X(x, y) + iY(x, y)}\, dz, \tag{22}
$$

where $X + iY = \log \dfrac{f(z)}{z^{n+1}}$ and the integral is taken along any closed path around
the origin.
To evaluate the integral (22) we shall follow a method similar to the one used
by R. H. Fowler [6]. Putting $z = \rho e^{i\alpha}$ the relation (22) may be written:

$$
\text{coeff. of } x^n = \frac{1}{2\pi} \int_{-\pi}^{+\pi} \frac{f(\rho e^{i\alpha})}{\rho^n e^{i n \alpha}}\, d\alpha, \tag{23}
$$

where the value of $\rho$ is arbitrary. We shall put in particular $\rho = x_0$ where $x_0$ is
the root of

$$
\frac{x_0 f'(x_0)}{f(x_0)} = n. \tag{24}
$$

For most functions which interest us $\dfrac{f(x)}{x^n} \to \infty$ as $x \to 0$ and as $x \to K$ (a positive
number which in some cases may be infinite) and the second derivative is always
positive. Consequently $f(x)/x^n$ has only one minimum between $0$ and $K$, and (24)
has therefore only one root $x_0$. Developing $\log f(x_0 e^{i\alpha})$ into powers of $\alpha$, (23)
becomes

$$
\text{coeff. of } x^n = \frac{1}{2\pi}\, \frac{f(x_0)}{x_0^n} \int_{-\pi}^{+\pi} e^{-\frac{x_0^2 \varphi''(x_0)}{2}\alpha^2 + i g(x_0)\alpha^3 + h(x_0)\alpha^4 + \cdots}\, d\alpha, \tag{25}
$$

where

$$
\varphi(x) = \log \frac{f(x)}{x^n}.
$$

In the case where $\varphi''(x_0)\, x_0^2 \gg 1$ the first term in the exponent in (25) increases
in absolute value very rapidly in the neighborhood of $x_0$. For small values of $\alpha$
we may therefore in a first approximation drop all other terms. Also, as this
first term tends rapidly towards zero one does not appreciably increase the error
by replacing the integral from $-\pi$ to $+\pi$ by the integral from $-\infty$ to $+\infty$.
In such cases we have, therefore, the approximate formula

$$
\text{coeff. of } x^n \approx \frac{1}{2\pi}\, \frac{f(x_0)}{x_0^n} \int_{-\infty}^{+\infty} e^{-\frac{x_0^2 \varphi''(x_0)}{2}\alpha^2}\, d\alpha = \frac{f(x_0)}{x_0^{n+1} \sqrt{2\pi \varphi''(x_0)}}. \tag{26}
$$

We are now in a position to deduce asymptotic values for the probabilities $P(n)$
which we have previously calculated directly. In fact, for $f(x)$ defined by (9)
we obtain from (26) for large $n$

$$
P(n) \sim \frac{e^{-\frac{\lambda_1}{a} e^{a}}\; e^{\frac{\lambda_1}{a} e^{a x_0}}}{\sqrt{2\pi}\; x_0^n \sqrt{n (a x_0 + 1)}}, \tag{27}
$$

where $x_0$ is given by

$$
\lambda_1 x_0\, e^{a x_0} = n.
$$

In particular for $\lambda_1 = a$ and putting $a x_0 = y_0$ it follows that

$$
P(n) \sim 0.3989 \Bigl(\frac{a}{y_0}\Bigr)^{n} \frac{e^{e^{y_0} - e^{a}}}{\sqrt{n (y_0 + 1)}}. \tag{28}
$$

Comparing the numerical values given by the relation (28) with the exact values
we find that even for $n = 4$ and $\lambda_1 = 1$ (28) gives an approximation with an error
of about 5%.

Formula$^2$ (26) can also be used to evaluate the numbers $y_n(1)$ defined by (12)
for $\alpha = 1$. Relation (13) gives for $\alpha = \beta = 1$

$$
e^{e^{x}} = e\Bigl[1 + \sum_{n=1}^{\infty} \frac{y_n(1)}{n!}\, x^n\Bigr]
$$

and therefore

$$
\text{Coeff. of } x^n \text{ in expansion of } e^{e^{x}} = \frac{e\, y_n(1)}{n!}.
$$

Putting $f(x) = e^{e^{x}}$ and using Stirling's formula for $n!$ we have from (26)

$$
y_n(1) \sim \frac{e^{\,n\left[x_0 + \frac{1}{x_0} - \left(1 + \frac{1}{n}\right)\right]}}{\sqrt{x_0 + 1}},
$$

where $x_0$ is given by

$$
x_0\, e^{x_0} = n.
$$

For $n = 4$, $x_0 = 1.202$ and $y_4(1) \sim 15.56$. As the exact value of $y_4(1)$ is 15 we
obtain in this case an error of less than 4%.

Repeating the calculations for $n = 6$, $x_0 = 1.432$, we find that $y_6(1)$ is given
with an error of less than 3%.

2 Applying this relation to $f(x) = e^{x}$ one obtains immediately Stirling's Formula:

$$
\varphi(x) = \log \frac{f(x)}{x^n} = x - n \log x, \qquad
\varphi'(x_0) = 1 - \frac{n}{x_0} = 0, \qquad
\varphi''(x_0) = \frac{n}{x_0^2} = \frac{1}{n},
$$

$$
\frac{1}{n!} \approx \Bigl(\frac{e}{n}\Bigr)^{n} \frac{1}{\sqrt{2\pi n}}.
$$

Also relation (26) is useful to find other asymptotic expressions; e.g. for $f(x) = (px + q)^n$ one
obtains for $n \to \infty$ the Laplace-Gauss formula.
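The quality of these approximations can be checked with a few lines of code. The sketch below is an illustration under stated assumptions: it takes $\lambda_1 = a$, solves $y_0 e^{y_0} = n$ by Newton's method (the choice of solver is ours), obtains the exact $y_n(1)$ from the Bell triangle, and compares the exact values with the steepest descent approximations. All function names are hypothetical.

```python
from math import exp, factorial, pi, sqrt


def solve_y0(n: int, tol: float = 1e-12) -> float:
    """Newton iteration for the positive root of y * exp(y) = n."""
    y = 1.0
    for _ in range(100):
        step = (y * exp(y) - n) / ((y + 1.0) * exp(y))
        y -= step
        if abs(step) < tol:
            break
    return y


def bell_number(n: int) -> int:
    """Exact y_n(1) for n >= 1, computed with the Bell triangle."""
    row = [1]
    for _ in range(n - 1):
        nxt = [row[-1]]
        for v in row:
            nxt.append(nxt[-1] + v)
        row = nxt
    return row[-1]


def p_exact(n: int, a: float) -> float:
    """Exact P(n) for lambda_1 = a."""
    return exp(-(exp(a) - 1.0)) * a ** n / factorial(n) * bell_number(n)


def p_saddle(n: int, a: float) -> float:
    """Steepest descent approximation of P(n) for lambda_1 = a."""
    y0 = solve_y0(n)
    return (a / y0) ** n * exp(exp(y0) - exp(a)) / sqrt(2.0 * pi * n * (y0 + 1.0))


def bell_saddle(n: int) -> float:
    """Approximation of y_n(1) obtained with Stirling's formula for n!."""
    x0 = solve_y0(n)
    return exp(n * (x0 + 1.0 / x0 - (1.0 + 1.0 / n))) / sqrt(x0 + 1.0)


if __name__ == "__main__":
    print(solve_y0(4), p_saddle(4, 1.0) / p_exact(4, 1.0), bell_saddle(4))
```

For $n = 4$ and $a = 1$ the relative error of the approximate $P(n)$ stays in the neighborhood of 5%, in line with the statement in the text.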
NOTES


This section is devoted to brief research and expository articles, notes on
methodology and other short items.


A NOTE ON SOME SINGLE SAMPLING PLANS REQUIRING THE
INSPECTION OF A SMALL NUMBER OF ITEMS


By J. H. Curtiss


Cornell University$^1$


In the practical application of sampling inspection plans it is often necessary 
to restrict the number of items (pieces, samples) inspected from each inspection 
lot to a relatively small number. For example, if many vendors are supplying 
a manufacturer with small lots of various kinds of material, the manufacturer 
will usually wish to have some check on his suppliers; however, he cannot afford 
to inspect large numbers of items from each lot. If sampling plans requiring 
the inspection of a small number of items are used, it is advantageous to know 
the characteristics of such plans. The present note offers several single sampling 
plans with sample size n < 25, together with their operating characteristic 
curves (OC curves) and average outgoing quality curves (AOQ curves). 

Single sampling plans for large lots may be described by the number $n$ of items
to be inspected, and the rejection number $r$. If $r$ or more of the items inspected
fail to meet some predetermined standard the lot is rejected; if less than $r$ items
fail to meet the standard the lot is accepted.

The OC curve (see Figures 1, 1A, 3 and 5) shows the relationship between the 
probability of rejecting a lot and the true quality of the lot. The quality of the 
lot is often measured by the ‘“‘percent defective” in the lot; i.e., the proportion of 
material which does not meet some predetermined standard. It should be noted 
that the definition of OC curve given here is only one of several in common use. 
In particular, the vertical axis often gives the probability of ‘‘acceptance’’; such 
a treatment would amount to an “inversion” of the curves given here. Another 


1 The material in this note was originally prepared as an office memorandum for the use 
of engineering technical personnel in a Government Bureau. The author wishes to express 
his appreciation to Mr. C. F. Mosteller for extensive editorial work on the original memoran- 
dum which has resulted in a revision more suitable for publication in the Annals. 

2 The OC and AOQ curves are often adequate to analyze single sampling plans because it 
is not customary to curtail single sampling even when the outcome of the inspection (accept- 
ance or rejection) is determined before all the items are inspected. In other kinds of sam- 
pling plans (double, multiple, and sequential) where curtailing is often used after the first 
sample, curves for the average amount of sampling are also useful. However, if one is 
interested in the average amount of inspection, including detailing, as a manufacturer 
inspecting his own product might be, curves for the average amount of inspection would be 
useful in connection with any sampling plan.

common form would have the "percentage of presented lots (of quality indicated
on the horizontal axis) that will be rejected (accepted)" as its vertical scale.

FIGURE 1


It has been assumed that the lots are so large that the samples can be regarded
as being drawn from an infinite population, or to put it another way, that there
(abbreviated AQL), and in published sampling tables by Dodge and Romig,$^3$ a
rejection probability of 90% is associated with a quality value which they call
the "tolerance percent defective."

The average outgoing quality curve (AOQ curve, see Figures 2, 4 and 6) of a 
sampling plan shows the relationship between the long run average quality of 
the outgoing product after sampling inspection and the quality of the product as 
submitted for inspection. The quality of the product in each case is usually 
measured by the “percent defective” in the product. 


SUPPLEMENT TO FIGURES 1 AND 1A.

Quality of Lot (measured in percent defective) corresponding to various probabilities
of rejection, for sampling plans in which a lot is to be rejected if one or more
defective items are found in a set of n random sample items

| n  | Probability of rejection .01 | .05 | .25 | .50 | .75 | .90 |
|----|------|------|-------|-------|-------|-------|
| 1  | 1.00 | 5.00 | 25.00 | 50.00 | 75.00 | 90.00 |
| 2  | 0.50 | 2.53 | 13.40 | 29.29 | 50.00 | 68.38 |
| 3  | 0.34 | 1.70 | 9.14  | 20.63 | 37.01 | 53.58 |
| 4  | 0.25 | 1.28 | 6.94  | 15.91 | 29.29 | 43.77 |
| 5  | 0.20 | 1.02 | 5.59  | 12.95 | 24.21 | 36.90 |
| 6  | 0.17 | 0.85 | 4.68  |       | 20.63 | 31.87 |
| 7  | 0.14 | 0.73 | 4.03  |       | 17.97 | 28.03 |
| 8  | 0.12 | 0.64 | 3.53  |       | 15.91 | 25.01 |
| 9  | 0.11 | 0.57 | 3.14  |       | 14.28 | 22.57 |
| 10 | 0.10 | 0.51 | 2.84  |       | 12.95 | 20.57 |
| 11 | 0.09 | 0.47 | 2.58  |       | 11.84 | 20.40 |
| 12 | 0.08 | 0.43 |       |       | 10.91 | 17.46 |
| 14 | 0.07 | 0.36 | 2.03  |       | 9.43  | 15.17 |
| 16 | 0.06 | 0.32 | 1.78  | 4.24  | 8.30  | 13.40 |
| 20 | 0.05 | 0.26 | 1.43  | 3.41  | 6.70  | 10.88 |

All entries are in percent.
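Entries of this kind can be recomputed from the rejection rule. The sketch below assumes, as an idealization not quoted from the note, that the number of defectives in the sample is binomial (a very large lot), so that a plan rejecting on one or more defectives in $n$ items has rejection probability $1 - (1 - p)^n$; solving for $p$ gives the lot quality. The function name is illustrative, and the results agree with the tabulated values only up to rounding.

```python
def quality_for_rejection(prob_reject: float, n: int) -> float:
    """Percent defective p (in percent) for which 1 - (1 - p)^n equals prob_reject."""
    return 100.0 * (1.0 - (1.0 - prob_reject) ** (1.0 / n))


if __name__ == "__main__":
    for n in (1, 2, 5, 20):
        row = [round(quality_for_rejection(q, n), 2) for q in (0.01, 0.05, 0.25, 0.50, 0.75, 0.90)]
        print(n, row)
```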




















The average outgoing quality is dependent upon the treatment of rejected lots. 
If rejected lots are cast aside once and for all, and are never resubmitted with all 
deficiencies corrected, then the average quality of the outgoing product after 
the sampling inspection tends to be the same as the average quality of the product 
submitted for inspection (provided that the quality of individual lots does not 
fluctuate too wildly). The only direct effect that the sampling inspection has 
in this case is to reduce the amount of the product which is accepted. However,
the situation is very different if a rejected lot is always resubmitted with all
defective material removed or replaced with non-defective material. In this case,
the average quality of the outgoing product after the sampling inspection will
tend to be better than the average quality of the product submitted for inspection.
In fact, if the submitted quality is very poor, the average outgoing quality
will theoretically tend to be very good, because so many of the lots are rejected
and then detailed.

FIGURE 2. Average outgoing quality curves for sampling plans in which a lot is to be
detailed if one or more defectives are found in n items. (Horizontal axis: percent defective.)

3 H. F. Dodge and H. G. Romig, Sampling Inspection Tables, Single and Double Sampling,
John Wiley and Sons, Inc., New York, 1944.





















Under the assumption that each rejected lot will be detailed and resubmitted
with the deficiencies corrected, a typical average outgoing quality curve starts
at the origin, rises to a maximum, and then falls off more slowly. The maximum
average outgoing quality is called the average outgoing quality limit
(AOQL) of the plan.


FIGURE 4. Average outgoing quality curves for sampling plans in which a lot is to be
detailed if two or more defectives are found in n items. (Horizontal axis: percent
defective in presented lots.)

FIGURE 5

FIGURE 6. Average outgoing quality curves for sampling plans in which a lot is to be
detailed if three or more defectives are found in n items. (Horizontal axis: percent
defective in presented lots.)

The graphs give the operating characteristic curves and average outgoing
quality curves of certain single sampling plans. It is assumed the samples are
taken at random without replacements from a lot which contains at least 10 times
the specified number of samples. In the case of the average outgoing quality
curves, it is further assumed that rejected lots are always detailed and resubmitted
with all the defective material replaced by non-defective material. An
approximation has been made in the calculation of the AOQ curves which makes
them upper bounds. If it is assumed that many lots of size $N$ of exactly the
same quality of product $p$ are being produced and that we are taking samples of
size $n$ from them, then it follows that $\mathrm{AOQ} = p\, P_a\, (1 - n/N)$, where $P_a$ is the
probability of accepting a lot. The term $n/N$ has been omitted; therefore these
AOQ curves are too high, but are a good approximation provided only that the
ratio of sample size to lot size is small. The condition mentioned earlier in this
paragraph requires that $n/N \le 0.1$.
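The relation between the acceptance probability, the AOQ and the AOQL can be sketched as follows. This is a hedged illustration, not the computation used for the published curves: the acceptance probability is taken as binomial (an assumption appropriate only when the lot is much larger than the sample), the AOQL is located by a simple grid search, and all function names are hypothetical.

```python
from math import comb


def prob_accept(p: float, n: int, r: int) -> float:
    """Probability of fewer than r defectives among n items (binomial assumption)."""
    return sum(comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(r))


def aoq(p: float, n: int, r: int, lot_size=None) -> float:
    """AOQ = p * P_a * (1 - n/N); the factor (1 - n/N) is omitted when lot_size is None."""
    factor = 1.0 if lot_size is None else 1.0 - n / lot_size
    return p * prob_accept(p, n, r) * factor


def aoql(n: int, r: int, steps: int = 10000) -> float:
    """Maximum of the AOQ curve over a grid of submitted qualities p in [0, 1]."""
    return max(aoq(i / steps, n, r) for i in range(steps + 1))


if __name__ == "__main__":
    for r in (1, 2, 3):
        print(r, round(100 * aoql(10, r), 2), "percent defective")
```

Omitting the factor $1 - n/N$ gives the upper bound described in the text; passing a lot size shows how much lower the AOQ is for a finite lot.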





ON THE USE OF THE SAMPLE RANGE IN AN ANALOGUE
OF STUDENT'S t-TEST


By Joseph F. Daly


Bureau of Ships, Navy Department


Let $x_1, \dots, x_N$ represent independent observations on a variate $x$ which is
normally distributed with mean $\mu$ and variance $\sigma^2$. Assuming no prior information
about the value of either parameter, let $H_0$ be the hypothesis that $\mu$ is equal
to or less than a specified quantity $\mu_0$. The classical test of this asymmetrical
form of "Student's" hypothesis [1] is based upon the statistic

$$
t = \sqrt{N}\,(\bar{x} - \mu_0) \Big/ \sqrt{\frac{\sum (x_i - \bar{x})^2}{N - 1}},
$$

the region of rejection being defined by the relation $t > t_\alpha$.

For certain applications of a routine nature, however, such as production line 
inspection, the usefulness of this test is rather seriously impaired by the arith- 
metical work involved in the computation of ¢. For this reason Dodge [2] and 
Knudsen [3] among others have proposed tests of Ho based on a statistic of the 
form 


Gm xt — Mo 
w 

where w is the sample range. It is the object of this note to show how the 
probability distribution of G can be obtained with the aid of the distribution law 
of w tabulated by Pearson and Hartley [4], and to present some numerical results 
which indicate that the power of the resulting test is the same for all practical 
purposes as that of ‘‘Student’s” ¢-test for sample sizes N < 10. 

The calculation of the percent points of the G distribution is greatly facilitated 
by the following result, which does not appear to be generally known: 

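Lemma: If $\bar{x}$ and $w$ represent respectively the average and the range of a sample of $N$ independent observations on a normally distributed variate $x$, then $\bar{x}$ and $w$ are statistically independent.

Proof: No generality is lost by putting $\mu = 0$, $\sigma = 1$. The joint characteristic function of $\bar{x}$ and the $\tfrac{1}{2}N(N - 1)$ differences $x_j - x_k$ $(j < k)$ is then

$$\varphi(t, t_{jk}) = (2\pi)^{-N/2} \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} \exp\Big\{ -\tfrac{1}{2}\sum_j x_j^2 + i t \bar{x} + i \sum_j \sum_k t_{jk}(x_j - x_k) \Big\}\, dx_1 \cdots dx_N,$$

where the summation runs from 1 to $N$ on each index with the understanding that $t_{jk} = 0$ for $j \ge k$. The usual process of completing the square in the exponent then yields

$$\varphi(t, t_{jk}) = \exp\Big\{ -\tfrac{1}{2}\sum_j \Big[\frac{t}{N} + \sum_k (t_{jk} - t_{kj})\Big]^2 \Big\}\, (2\pi)^{-N/2} \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} \exp\Big\{ -\tfrac{1}{2}\sum_j \Big( x_j - i\Big[\frac{t}{N} + \sum_k (t_{jk} - t_{kj})\Big] \Big)^2 \Big\}\, dx_1 \cdots dx_N.$$

Since

$$\int_{-\infty}^{\infty} e^{-\frac{1}{2}(x + ib)^2}\, dx = \int_{-\infty}^{\infty} e^{-\frac{1}{2}x^2}\, dx,$$

this reduces to

$$\varphi(t, t_{jk}) = \exp\Big\{ -\tfrac{1}{2}\sum_j \Big[\frac{t}{N} + \sum_k (t_{jk} - t_{kj})\Big]^2 \Big\},$$

which readily factors into

$$\varphi_1(t) \cdot \varphi_2(t_{jk}) = e^{-t^2/2N} \cdot \exp\Big\{ -\tfrac{1}{2}\sum_j \Big[\sum_k (t_{jk} - t_{kj})\Big]^2 \Big\}.$$

Hence the differences $x_j - x_k$ are jointly independent of $\bar{x}$; and since the range $w$ is a Borel measurable function of these differences (i.e., $w = \max |x_j - x_k|$) it follows that $\bar{x}$ and $w$ are independently distributed.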
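The lemma can be checked informally by simulation. The sketch below is not part of the note: it estimates the correlation between the sample mean and the sample range for normal samples and, for contrast, for exponential samples, where the two are clearly dependent. A near-zero correlation is of course only a necessary sign of independence, and the function name, sample sizes and seed are arbitrary assumptions.

```python
import math
import random

def mean_range_correlation(draw, n_obs=5, n_samples=20000, seed=0):
    """Monte Carlo correlation between the sample mean and the sample range."""
    rng = random.Random(seed)
    means, ranges = [], []
    for _ in range(n_samples):
        xs = [draw(rng) for _ in range(n_obs)]
        means.append(sum(xs) / n_obs)
        ranges.append(max(xs) - min(xs))
    m_bar = sum(means) / n_samples
    r_bar = sum(ranges) / n_samples
    cov = sum((m - m_bar) * (r - r_bar) for m, r in zip(means, ranges))
    var_m = sum((m - m_bar) ** 2 for m in means)
    var_r = sum((r - r_bar) ** 2 for r in ranges)
    return cov / math.sqrt(var_m * var_r)

if __name__ == "__main__":
    normal = mean_range_correlation(lambda rng: rng.gauss(0.0, 1.0))
    expo = mean_range_correlation(lambda rng: rng.expovariate(1.0))
    print("normal samples:     ", round(normal, 3))
    print("exponential samples:", round(expo, 3))
```

For the normal case the estimate should differ from zero only by sampling noise, while the exponential case shows a strong positive correlation.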

The foregoing lemma is in fact capable of further generalization as follows: 

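Let $g(x_1, \ldots, x_N)$ be a function which, like the range, has the property that $g(x_1 + a, \ldots, x_N + a) = g(x_1, \ldots, x_N)$. The characteristic function of $\bar{x}$ and $g$ can then be written in the form

$$\varphi(t, \lambda) = e^{-t^2/2N}\, (2\pi)^{-N/2} \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} \exp\Big\{ -\tfrac{1}{2}\sum_j \Big(x_j - i\frac{t}{N}\Big)^2 + i\lambda g(x) \Big\}\, dx_1 \cdots dx_N = \varphi_1(t) \cdot \psi(t, \lambda).$$

Now if the second factor $\psi$ is analytic in $t$, it must be a constant as far as variation with $t$ is concerned; for by putting $t = iNa$ ($a$ real) we have

$$\psi(iNa, \lambda) = (2\pi)^{-N/2} \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} \exp\Big\{ -\tfrac{1}{2}\sum_j (x_j + a)^2 + i\lambda g(x) \Big\}\, dx_1 \cdots dx_N$$

$$= (2\pi)^{-N/2} \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} \exp\Big\{ -\tfrac{1}{2}\sum_j (x_j + a)^2 + i\lambda g(x + a) \Big\}\, dx_1 \cdots dx_N$$

$$= (2\pi)^{-N/2} \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} \exp\Big\{ -\tfrac{1}{2}\sum_j x_j^2 + i\lambda g(x) \Big\}\, dx_1 \cdots dx_N = \varphi_2(\lambda).$$

Therefore $\psi(t, \lambda)$, being constant in $t$ along the axis of imaginaries, must be free of $t$ throughout the complex plane. The joint characteristic function of $\bar{x}$ and $g$ is thus equal to the product of their respective characteristic functions, so that the two variates are independently distributed. In particular this result shows that in the normal case each of the moments about the sample mean is distributed independently of $\bar{x}$.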
Returning now to the distribution of $G$, we see that for $G_\epsilon > 0$

$$P\Big\{\frac{\bar{x} - \mu}{w} > G_\epsilon\Big\} = P\Big\{\frac{\sqrt{N}(\bar{x} - \mu)/\sigma}{\sqrt{N}\,G_\epsilon} > w/\sigma\Big\} = \int_{z=0}^{\infty} \int_{w=0}^{z/\sqrt{N}G_\epsilon} f(z)\,h(w)\, dw\, dz = \int_0^{\infty} f(z)\, P(z/\sqrt{N}G_\epsilon)\, dz,$$

where $f(z)$ is the normal probability function for $\mu = 0$, $\sigma = 1$, and $P(u)$ is the value [4] of the probability that the range of a sample of $N$ observations will be less than $u$ standard units. For selected values of $N$ Table I gives the value $G_{.05}$ such that

$$P\{(\bar{x} - \mu_0)/w > G_{.05} \mid \mu = \mu_0\} = .05.$$

TABLE I
Upper 5% points for distribution of G

$G_{.05}$
.88
.39
.26
.19

These values were calculated by Simpson's rule and checked by Weddle's rule. To evaluate the probability that $G$ will exceed $G_\epsilon$ when $\mu \ne \mu_0$ we may write, following Johnson and Welch [5],

$$\frac{\bar{x} - \mu_0}{w} = \frac{\sqrt{N}(\bar{x} - \mu)/\sigma + \sqrt{N}(\mu - \mu_0)/\sigma}{\sqrt{N}\,w/\sigma}.$$

The required probability is then given by the integral

$$\int_{-\delta}^{\infty} f(z)\, P\Big(\frac{z + \delta}{\sqrt{N}\,G_\epsilon}\Big)\, dz, \qquad \delta = \sqrt{N}(\mu - \mu_0)/\sigma.$$

Table II is a comparison of the probability that $G$ will exceed $G_{.05}$ with the corresponding probability that "Student's" $t$ will exceed $t_{.05}$ for various values of $(\mu - \mu_0)/\sigma$, the case $N = 3$ being chosen because the non-central $t$ distribution is formally integrable in this case.

TABLE II
Probability of rejection for G and for t, (N = 3)

(μ − μ0)/σ | P{G > .88} | P{t > 2.92}
   .00     |    .050    |    .050
   .50     |    .151    |    .151
   .75     |    .229    |    .230
  1.00     |    .322    |    .322

Similarly for $N = 10$ it was found that when $\mu - \mu_0 = .383\sigma$ (i.e., when $\delta = 1.21$) the probability that $G$ will exceed $G_{.05}$ is .296; the corresponding probability for $t$ is given by Neyman and Tokarska [1] as .30.
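The rejection probability integral can be evaluated numerically with a few lines of code. The sketch below is an illustration under stated assumptions, not the computation used in the note: it uses Simpson's rule throughout, and it obtains the range distribution P(u) from the standard expression N∫f(x)[F(x+u) − F(x)]^(N−1) dx rather than from the Pearson and Hartley tables. All function names are hypothetical.

```python
import math

def norm_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def norm_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def simpson(func, a, b, steps=200):
    """Composite Simpson's rule with an even number of steps."""
    h = (b - a) / steps
    total = func(a) + func(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3.0

def range_cdf(u, n):
    """P(u): probability that the range of n standard normal observations is below u."""
    if u <= 0.0:
        return 0.0
    integrand = lambda x: norm_pdf(x) * (norm_cdf(x + u) - norm_cdf(x)) ** (n - 1)
    return n * simpson(integrand, -8.0, 8.0)

def rejection_probability(shift, n, g_crit):
    """Probability that G exceeds g_crit when (mu - mu0)/sigma equals shift."""
    delta = math.sqrt(n) * shift
    integrand = lambda z: norm_pdf(z) * range_cdf((z + delta) / (math.sqrt(n) * g_crit), n)
    return simpson(integrand, -delta, 8.0)

if __name__ == "__main__":
    for shift in (0.0, 0.5, 0.75, 1.0):
        print(shift, round(rejection_probability(shift, 3, 0.88), 3))
```

Run for N = 3 with the critical value .88, the printed values can be set beside the G column of Table II.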

Pending the construction of more adequate tables of the percent points of the $G$ distribution, it seems worthy of note that for $N \le 10$ the values of $G_{.05}$ can be estimated quite accurately by multiplying the corresponding upper percent point $t_{.05}$ by the factor

$$\frac{E\Big[\sqrt{\sum (x - \bar{x})^2/(N - 1)}\,\Big]}{\sqrt{N}\, E[w]},$$

where $E[w]$ is obtainable from Tippett's table of the mean range [6]. Estimated values of $G_{.05}$ for sample sizes from 3 to 10 are listed for convenience in Table III. The approximate values of $G_{.05}$ proposed by Knudsen [3] were calculated in essentially this fashion, using however the square root of the expected value of $\sum (x - \bar{x})^2$ instead of the expected value of $\sqrt{\sum (x - \bar{x})^2}$, and employing percent points of the $t$ distribution determined by the relation $P\{|t| > t_{.05}\} = .05$ instead of $P\{t > t_{.05}\} = .05$. Thus though the agreement between the values listed in Table III and the corresponding computed values shown in Table I is extremely good, the discrepancy between these values and those given by Knudsen is rather large. Any error committed by using Knudsen's table will, however, be on the conservative side, in the sense that the probability of unjustly rejecting $H_0$ will have somewhat less than half the value indicated in that table.

TABLE III
Estimated upper 5% points for distribution of G

 N | G.05
 3 | .882
 4 | .526
 5 | .385
 6 | .309
 7 | .260
 8 | .227
 9 | .202
10 | .183
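The estimate described above is easy to recompute. The following sketch is an assumption-laden illustration rather than the author's procedure: the upper 5% points of t are typed in from a standard table, the expected standard deviation uses the usual gamma-function expression, and the mean range is obtained by Simpson's rule from E[w] = ∫[1 − F(x)^N − (1 − F(x))^N] dx instead of from Tippett's table.

```python
import math

# Upper 5% points of Student's t with N - 1 degrees of freedom (standard table values).
T_05 = {3: 2.920, 4: 2.353, 5: 2.132, 6: 2.015, 7: 1.943, 8: 1.895, 9: 1.860, 10: 1.833}

def normal_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def expected_range(n, lo=-10.0, hi=10.0, steps=4000):
    """E[w] in standard units, by Simpson's rule on 1 - F^n - (1 - F)^n."""
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        p = normal_cdf(lo + i * h)
        value = 1.0 - p ** n - (1.0 - p) ** n
        weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += weight * value
    return total * h / 3.0

def expected_sd(n):
    """E[sqrt(sum (x - xbar)^2 / (n - 1))] in standard units."""
    return math.sqrt(2.0 / (n - 1)) * math.gamma(n / 2.0) / math.gamma((n - 1) / 2.0)

def estimated_g05(n):
    """t_.05 multiplied by the factor E[s] / (sqrt(N) E[w])."""
    return T_05[n] * expected_sd(n) / (math.sqrt(n) * expected_range(n))

if __name__ == "__main__":
    for n in sorted(T_05):
        print(n, round(estimated_g05(n), 3))
```

The printed values can be compared line by line with Table III.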


REFERENCES

[1] J. Neyman and B. Tokarska, "Errors of the second kind in testing 'Student's' hypothesis," Am. Stat. Assn. Jour., Vol. 31 (1936), pp. 318-326.

[2] H. F. Dodge, "Statistical control in sampling inspection," American Machinist, Vol. 76 (1932), p. 1130.

[3] Lila F. Knudsen, "A method for determining the significance of a shortage," Am. Stat. Assn. Jour., Vol. 38 (1943), pp. 466-470.

[4] E. S. Pearson and H. O. Hartley, "The probability integral of the range in samples of n observations from a normal population," Biometrika, Vol. 32 (1942), pp. 301-310.

[5] N. L. Johnson and B. L. Welch, "Applications of the non-central t distribution," Biometrika, Vol. 31 (1940), pp. 362-389.

[6] L. H. C. Tippett, "On the extreme individuals and the range of samples taken from a normal population," Biometrika, Vol. 17 (1925), pp. 364-387.





AN INEQUALITY FOR DEVIATIONS FROM MEDIANS 


By John W. Tukey


Princeton University

In a recent note in these Annals, Birnbaum and Zuckerman [1] proved that if:

(1) $X_1, X_2, \ldots, X_n$ are independent random variables with the same distribution (i.e., form a sample),
(2) their common distribution is symmetric about zero,

then

$$E(|X_1 + X_2 + \cdots + X_n|) \ge \varphi(n) \cdot E(|X|),$$

where

$$\varphi(2k + 1) = \varphi(2k + 2) = \frac{(2k + 1)!}{2^{2k}\,(k!)^2}.$$

It is the purpose of the present note to extend this to the following, more general, result:

THEOREM. If

(i) $X_1, X_2, \ldots, X_n$ are independent random variables,
(ii) the median of each $X_i$ is zero,

then

$$E(|X_1 + X_2 + \cdots + X_n|) \ge \frac{\varphi(n)}{n}\, E(|X_1| + |X_2| + \cdots + |X_n|).$$

It will be convenient to let $d_i = E(|X_i|)$ and

$$d = \frac{1}{n}\sum d_i = \frac{1}{n}\, E(|X_1| + |X_2| + \cdots + |X_n|),$$

so that the desired inequality becomes

$$E(|X_1 + X_2 + \cdots + X_n|) \ge \varphi(n) \cdot d.$$

Define $e_i$ by

$$(4) \qquad e_i = \int_0^{\infty} x\, dF_i(x),$$

where $F_i(x)$ is the cumulative distribution function of $X_i$. Since

$$d_i = E(|X_i|) = -\int_{-\infty}^{0} x\, dF_i(x) + \int_0^{\infty} x\, dF_i(x),$$

it follows that

$$(5) \qquad \int_{-\infty}^{0} x\, dF_i(x) = e_i - d_i.$$

The basic idea of the proof, which is common to both notes, is to divide the n-dimensional space of $x_1, x_2, \ldots, x_n$ into its $2^n$ "octants," break up the expectation of $|X_1 + X_2 + \cdots + X_n|$ into the corresponding parts, and apply elementary inequalities. Let $O_S$ be the octant in which a set $S$ of $s$ variables are $< 0$. From (4), (5) and hypothesis (ii) it follows that

$$2^{n-1} \int \cdots \int_{O_S} x_i \prod_j dF_j(x_j) = \begin{cases} e_i, & x_i > 0 \text{ in } O_S, \\ e_i - d_i, & x_i < 0 \text{ in } O_S, \end{cases}$$

and hence

$$2^{n-1} \int \cdots \int_{O_S} \sum_{i=1}^{n} x_i \prod_j dF_j(x_j) = \sum e_i - \sum_S d_i = e - \sum_S d_i,$$

where $e = \sum e_i$, and the second and third sums are over all $d_i$ for which $x_i < 0$ in the chosen octant $O_S$. The contribution of the octant $O_S$ to $E(|X_1 + X_2 + \cdots + X_n|)$ is

$$\int \cdots \int_{O_S} \Big|\sum x_i\Big| \prod_j dF_j(x_j) \ge \Big| \int \cdots \int_{O_S} \Big(\sum x_i\Big) \prod_j dF_j(x_j) \Big| = \frac{1}{2^{n-1}} \Big| e - \sum_S d_i \Big|.$$

For each value of $s$, there will be $\binom{n}{s}$ octants with $s$ variables $< 0$. The sum of their contributions to $E(|X_1 + X_2 + \cdots + X_n|)$ is

$$I_s \ge \frac{1}{2^{n-1}} \sum_S \Big| e - \sum_S d_i \Big| \ge \frac{1}{2^{n-1}} \Big| \binom{n}{s} e - \binom{n-1}{s-1} \sum d_i \Big|,$$

where the inequality follows from $\sum |a_i| \ge |\sum a_i|$, and it is noticed that each $d_i$ occurs in $\binom{n-1}{s-1}$ different inner sums. Recalling that $\sum d_i = nd$, this may be written

$$I_s \ge \frac{1}{2^{n-1}} \binom{n}{s} |e - sd|.$$

Finally,

$$E(|X_1 + X_2 + \cdots + X_n|) = \sum_{s=0}^{n} I_s \ge \frac{1}{2^{n-1}} \sum_{s=0}^{n} \binom{n}{s} |e - sd|$$

$$\ge \frac{1}{2^{n-1}} \sum_{2s<n} \binom{n}{s} \big\{ |e - sd| + |e - (n - s)d| \big\}$$

$$\ge \frac{1}{2^{n-1}} \sum_{2s<n} \binom{n}{s} (n - 2s)\, d,$$

where the last inequality follows from $|a| + |b| \ge b - a$. To complete the proof, it is only necessary to evaluate the last sum. One method of evaluation may be found in Birnbaum and Zuckerman's note.

If each $X_i = \pm 1$, each with probability one-half, then all of the inequalities of the proof become equalities. So that, in this case,

$$E(|X_1 + X_2 + \cdots + X_n|) = \varphi(n) \cdot d.$$

Since the limiting distribution in this case is a normal distribution with standard deviation $n^{1/2}$ and $E(|X_1 + X_2 + \cdots + X_n|) = (2n/\pi)^{1/2}$, it follows that this is the asymptotic value of $\varphi(n)$.

The inequality of the theorem is only efficient when the $E(|X_i|)$ are of nearly the same size. In other cases it can often be usefully supplemented by the

LEMMA. If

(i) $X_1, X_2, \ldots, X_n$ are independent,

(ii) for each $i$, either $X_i$ has median zero, or the sum of the means of the other $X_j$ is zero (this is implied by either (a) the median of each $X_i$ is zero, or (b) the mean of each $X_i$ is zero), then

$$E(|X_1 + X_2 + \cdots + X_n|) \ge \max_i E(|X_i|).$$

The lemma follows from the case where $n = 2$, by applying that case to

$$Y_1 = X_1, \qquad Y_2 = \sum_{i \ne 1} X_i,$$

where the maximum of $E(|X_i|)$ is attained for $i = 1$.

The special case follows from the inequality

$$|x_1 + x_2| \ge |x_1| + x_2 \cdot \operatorname{sgn} x_1,$$

since this implies

$$E(|X_1 + X_2|) \ge E(|X_1|) + E(X_2) \cdot E(\operatorname{sgn} X_1) = E(|X_1|),$$

using first (i) and then (ii).

In conclusion, it is interesting to note that the mean cannot replace the median in the hypothesis of the theorem. For let $X_1, X_2, X_3$ be independent, and take the values 1 (with probability 2/3) and −2 (with probability 1/3). $X_1 + X_2 + X_3$ takes the values 3 (with probability 8/27), 0 (with probability 12/27), −3 (with probability 6/27) and −6 (with probability 1/27). Hence $E(|X_i|) = 4/3$, and $E(|X_1 + X_2 + X_3|) = 48/27 = 16/9 = \tfrac{4}{3} E(|X_i|)$, which is not $\ge \tfrac{3}{2} E(|X_i|)$.
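The equality case and the counterexample above can both be confirmed by exact enumeration. The sketch below is an illustration only; the helper names are hypothetical, and φ(n) is computed from the final sum of the proof, (1/2^(n−1)) Σ_{2s<n} C(n, s)(n − 2s), rather than from a closed form.

```python
from fractions import Fraction
from itertools import product
from math import comb

def phi(n):
    """phi(n) as the last sum of the proof: 2^-(n-1) * sum over 2s < n of C(n, s)(n - 2s)."""
    total = sum(comb(n, s) * (n - 2 * s) for s in range(n + 1) if 2 * s < n)
    return Fraction(total, 2 ** (n - 1))

def expected_abs_sum(values, probs, n):
    """Exact E|X_1 + ... + X_n| for n independent copies of a finite distribution."""
    total = Fraction(0)
    for combo in product(range(len(values)), repeat=n):
        prob = Fraction(1)
        s = 0
        for idx in combo:
            prob *= probs[idx]
            s += values[idx]
        total += prob * abs(s)
    return total

if __name__ == "__main__":
    half = Fraction(1, 2)
    for n in range(1, 7):
        # Signs +1 and -1 with probability one-half: equality with phi(n).
        print(n, phi(n), expected_abs_sum([1, -1], [half, half], n))
    # Mean zero but median not zero: values 1 and -2 with probabilities 2/3 and 1/3.
    probs = [Fraction(2, 3), Fraction(1, 3)]
    print(expected_abs_sum([1, -2], probs, 1), expected_abs_sum([1, -2], probs, 3))
```

Exact rational arithmetic is used so that the comparison with 3/2 times E|X| involves no rounding.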
ON THE INDEPENDENCE OF THE EXTREMES IN A SAMPLE¹

By E. J. Gumbel


New School for Social Research

¹ Research done with the support of a grant from the American Philosophical Society.


In a previous article [1] the assumption was used that the mth observation in ascending order (from the bottom) and the mth observation in descending order (from the top) are independent variates, provided that the rank m is small compared to the sample size n. In the following it will be shown that this assumption holds for the usual distributions.

Let $x$ be a continuous, unlimited variate, let $\Phi(x)$ be the probability of a value equal to, or less than, $x$; let $\varphi(x)$ be the density of probability, henceforth called the initial distribution. The mth observation from the bottom is written ${}_m x$ and the kth observation from the top is written $x_k$. Thus, the bivariate distribution $w_n({}_m x, x_k)$ of ${}_m x$ and $x_k$ is such that there are $m - 1$ observations less than ${}_m x$; $k - 1$ observations greater than $x_k$ and $n - m - k$ observations between ${}_m x$ and $x_k$.

For simplicity's sake write

$$\Phi({}_m x) = {}_m\Phi; \qquad \Phi(x_k) = \Phi_k$$
$$\varphi({}_m x) = {}_m\varphi; \qquad \varphi(x_k) = \varphi_k$$

Then

$$(1) \qquad w_n({}_m x, x_k) = C_n\, {}_m\Phi^{\,m-1}\, {}_m\varphi\, (\Phi_k - {}_m\Phi)^{n-m-k}\, \varphi_k\, (1 - \Phi_k)^{k-1},$$

where

$$(1') \qquad C_n = \frac{n!}{(m - 1)!\,(k - 1)!\,(n - m - k)!}.$$

In the expression (1) no assumption about dependence or independence of ${}_m x$ and $x_k$ is implied except that these values are taken from the same population. The distribution (1) is now modified by introducing three conditions. First, that the two variates are extreme, namely that the ranks $m$ and $k$ are of the same order of magnitude and small compared to the sample size $n$,

$$(2) \qquad n \gg m \sim k = O(1).$$


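Furthermore it is assumed that the initial distribution $\varphi(x)$ is, for small and for large values of the variate, subject to L'Hospital's rules

$$(3) \qquad \lim_{x \to -\infty} \frac{\varphi(x)}{\Phi(x)} = \lim_{x \to -\infty} \frac{\varphi'(x)}{\varphi(x)}; \qquad \lim_{x \to \infty} \frac{-\varphi'(x)}{\varphi(x)} = \lim_{x \to \infty} \frac{\varphi(x)}{1 - \Phi(x)}.$$

Finally it is assumed that $n$ is so large that the equality of the limits may be replaced by the equality of the quotients. Then it is legitimate to write

$$(3') \qquad \frac{{}_m\varphi}{{}_m\Phi} = \frac{{}_m\varphi'}{{}_m\varphi}; \qquad \frac{-\varphi_k'}{\varphi_k} = \frac{\varphi_k}{1 - \Phi_k}.$$

Clearly, the three conditions do not imply any assumption about dependence or independence of the two extremes.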

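From (1) the most probable mth value from the bottom, ${}_m u$, and the most probable kth value from the top, $u_k$, are the solutions of

$$\frac{m - 1}{{}_m\Phi}\, {}_m\varphi + \frac{{}_m\varphi'}{{}_m\varphi} = \frac{n - m - k}{\Phi_k - {}_m\Phi}\, {}_m\varphi$$

$$\frac{n - m - k}{\Phi_k - {}_m\Phi}\, \varphi_k + \frac{\varphi_k'}{\varphi_k} = \frac{k - 1}{1 - \Phi_k}\, \varphi_k.$$

These two equations may be written by virtue of (3')

$$\frac{m}{{}_m\Phi} = \frac{n - m - k}{\Phi_k - {}_m\Phi} = \frac{k}{1 - \Phi_k}.$$

Consequently the probabilities of the most probable mth and kth values ${}_m u$ and $u_k$ are

$$(4) \qquad \Phi({}_m u) = \frac{m}{n}; \qquad \Phi(u_k) = 1 - \frac{k}{n}.$$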


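The expansion of the probabilities ${}_m\Phi$ and $\Phi_k$ around the modes ${}_m u$ and $u_k$ leads [2, 3] by virtue of (2), (3), (4), to

$$(5) \qquad {}_m\Phi = \frac{m}{n}\, e^{{}_m y}; \qquad 1 - \Phi_k = \frac{k}{n}\, e^{-y_k},$$

where

$$(6) \qquad {}_m y = \frac{n}{m}\, \varphi({}_m u)\, ({}_m x - {}_m u); \qquad y_k = \frac{n}{k}\, \varphi(u_k)\, (x_k - u_k).$$

Therefore, distributions, subject to L'Hospital's rules (3), may be said to be of the exponential type. Since the derivatives ${}_m\varphi$ and $\varphi_k$ are

$$(7) \qquad {}_m\varphi = {}_m\alpha\, {}_m\Phi; \qquad \varphi_k = \alpha_k (1 - \Phi_k),$$

where

$$(7') \qquad {}_m\alpha = \frac{n}{m}\, \varphi({}_m u); \qquad \alpha_k = \frac{n}{k}\, \varphi(u_k),$$

the product of the first two and the last two functions in formula (1) may be written as a product of two functions

$$(8) \qquad {}_m\Phi^{\,m-1}\, {}_m\varphi\, \varphi_k\, (1 - \Phi_k)^{k-1} = \Big( {}_m\alpha\, \frac{m^m}{n^m}\, e^{m\,{}_m y} \Big) \Big( \alpha_k\, \frac{k^k}{n^k}\, e^{-k y_k} \Big).$$

Clearly, each factor in (8) depends only on one variable.

In the same way the function of ${}_m x$ and $x_k$ in the middle of (1) can be split up into a product of two independent functions, each depending only on one variate. By virtue of (5)

$$\Phi_k - {}_m\Phi = 1 - \frac{1}{n} \big( m e^{{}_m y} + k e^{-y_k} \big)$$

and by virtue of (2)

$$(9) \qquad (\Phi_k - {}_m\Phi)^{n-m-k} = \exp(-m e^{{}_m y})\, \exp(-k e^{-y_k}),$$

where

$$\exp(x) = e^x.$$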


From (2) the constant factor (1') may also be split into a product

$$(10) \qquad \frac{n!}{(m - 1)!\,(k - 1)!\,(n - m - k)!} = \frac{n^m}{(m - 1)!} \cdot \frac{n^k}{(k - 1)!}.$$

Introducing (10), (9) and (8) into (1), the bivariate distribution of the mth extreme value from the bottom and the kth extreme value from the top is obtained as a product of two independent distributions

$$(11) \qquad w_n({}_m x, x_k) = {}_m f({}_m x) \cdot f_k(x_k),$$

where

$$(12) \qquad {}_m f({}_m x) = \frac{m^m}{(m - 1)!}\, {}_m\alpha\, \exp(m\, {}_m y - m e^{{}_m y})$$

and

$$(12') \qquad f_k(x_k) = \frac{k^k}{(k - 1)!}\, \alpha_k\, \exp(-k y_k - k e^{-y_k})$$


are the distributions of the mth extreme values from the bottom, alone, and of the kth extreme values from the top, alone.

In the special case $m = k$ and for a symmetrical initial distribution with mean zero, the following equations hold

$$(13) \qquad {}_m\alpha = \alpha_k = \alpha_m; \qquad {}_m u = -u_k = -u_m.$$
$$(13') \qquad {}_m\Phi = 1 - \Phi_k = 1 - \Phi_m; \qquad {}_m y = -y_k = -y_m.$$

and the bivariate distribution of the mth values from the bottom, ${}_m x$, and from the top, $x_m$, is

$$(14) \qquad w_n({}_m x, x_m) = {}_m f({}_m x) \cdot f_m(x_m),$$

where

$$(14') \qquad {}_m f({}_m x) = f_m(-x_m)$$

is the expression used in the beginning of article [1].

It follows from (11) that the mth observation in ascending order, and the kth observation in descending order, may be dealt with as independent variates provided that $n$ is large, the ranks $m$ and $k$ are small, and that the initial continuous unlimited distribution is of the exponential type as defined by equations (3).
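The conclusion can be illustrated, though not proved, by simulation. The sketch below is an illustration under stated assumptions: it takes m = k = 1 and a standard normal initial distribution, estimates the correlation between the smallest and largest observations, and contrasts a large sample with samples of size 2, where the two extremes are clearly dependent. Vanishing correlation is only a necessary sign of independence, and the function name, sample sizes and seed are arbitrary.

```python
import math
import random

def extreme_correlation(n, reps=3000, seed=1):
    """Monte Carlo correlation between the minimum and maximum of n normal observations."""
    rng = random.Random(seed)
    lows, highs = [], []
    for _ in range(reps):
        xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
        lows.append(min(xs))
        highs.append(max(xs))
    lo_bar = sum(lows) / reps
    hi_bar = sum(highs) / reps
    cov = sum((a - lo_bar) * (b - hi_bar) for a, b in zip(lows, highs))
    var_lo = sum((a - lo_bar) ** 2 for a in lows)
    var_hi = sum((b - hi_bar) ** 2 for b in highs)
    return cov / math.sqrt(var_lo * var_hi)

if __name__ == "__main__":
    print("n = 2:  ", round(extreme_correlation(2), 3))
    print("n = 200:", round(extreme_correlation(200), 3))
```

For n = 2 the exact correlation between minimum and maximum of two normal observations is 1/(π − 1), about 0.47, so the contrast with the large-sample estimate is easy to see.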
A NOTE ON SAMPLING INSPECTION


By Paul Peach and S. B. Littauer


North Carolina State College and Newark College of Engineering


In designing an industrial sampling plan conformable to the Pearson-Neyman 
approach, the operating characteristic is made to pass as nearly as possible 
through two predetermined points. Wald [1] has used this method for setting up 
sequential sampling plans. 

A similar type of single sampling plan can be designed by using tables of the 
incomplete Beta function. Unfortunately, tables of this function are not 
generally available, and the existing tables do not cover the range for large 
sample sizes. 

An approximate solution of the problem for single sampling can be based on the 
widely available tables of percentage points of the chi-square distribution. This 
is equivalent to assuming a Poisson distribution of defectives in the sample, 
utilizing the well known fact that for even degrees of freedom the chi-square 
distribution gives the summation of a Poisson series. 

We use the following well established notation:

$n$ = sample size
$c$ = acceptance number
$p_1$ = acceptable fraction defective
$p_2$ = objectionable fraction defective
$\alpha$ = risk of rejecting a lot if $p = p_1$
$\beta$ = risk of accepting a lot if $p = p_2$


There seems little to be gained by using a large assortment of possible risk values, since the necessary adjustment to secure a desired effect can be made on the $p$'s. We suggest the adoption of .05 as a standard value for both $\alpha$ and $\beta$. This convention conforms to much existing statistical practice, in particular to some existing inspection tables.

We propose also the use of

$$R_0 = p_2/p_1,$$

which we call the "operating ratio," as a measure of the power of discrimination of an inspection scheme. Dodge and Romig [2] used what is essentially the reciprocal of $R_0$ as a basis for the construction of sampling plans. Now, assume a binomial distribution of defectives in samples and a series of single sampling plans with the same $c$ but different $n$. As $n$ increases, the effective values of $p_1$ and $p_2$ clearly decrease. Their ratio $R_0$ is not constant, but it does not change very much after $n$ has got beyond the range of very small samples, say $5(c + 1)$. The value obtained from the chi-square table is the upper limit of $R_0$ for a fixed $c$ and increasing $n$. Since $R_0$ is to a first approximation a function of $c$ alone, provided $n$ is not very small, it is a useful index for the construction of tables, and gives great compactness.
and gives great compactness. 
Using the chi-square approach, we note that

$$\text{D. F.} = 2c + 2$$

$$np_1 = \tfrac{1}{2}\, \chi^2_{2c+2,\,1-\alpha}$$

$$np_2 = \tfrac{1}{2}\, \chi^2_{2c+2,\,\beta}$$

$$R_0 = \frac{\chi^2_{2c+2,\,\beta}}{\chi^2_{2c+2,\,1-\alpha}}.$$

Table I gives $R_0$, $c$, and $np_1$ over a considerable range, with $\alpha = \beta = .05$. Given $p_1$ and $p_2$, we calculate $R_0$ and use it to enter the table; $c$ is read off directly, and the sample size is $n = np_1/p_1$.
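The tabulated quantities can be recomputed without a chi-square table by using the Poisson sum directly, since the probability that a Poisson variate with mean λ does not exceed c equals the probability that chi-square with 2c + 2 degrees of freedom exceeds 2λ. The sketch below is an illustration under that identity; the function names and the bisection bounds are assumptions made here, not part of the note.

```python
import math

def poisson_cdf(c, lam):
    """P(X <= c) for a Poisson variate X with mean lam."""
    term = math.exp(-lam)
    total = term
    for k in range(1, c + 1):
        term *= lam / k
        total += term
    return total

def solve_mean(c, target):
    """Mean lam with P(X <= c) = target, by bisection (the sum decreases as lam grows)."""
    lo, hi = 0.0, 2.0 * (c + 1) + 20.0
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if poisson_cdf(c, mid) > target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def operating_ratio(c, alpha=0.05, beta=0.05):
    """Return (R_0, n*p_1) for acceptance number c under the Poisson approximation."""
    np1 = solve_mean(c, 1.0 - alpha)   # lot accepted with probability 1 - alpha
    np2 = solve_mean(c, beta)          # lot accepted with probability beta
    return np2 / np1, np1

if __name__ == "__main__":
    for c in (0, 1, 2, 5, 10):
        r0, np1 = operating_ratio(c)
        print(c, round(r0, 2), round(np1, 3))
```

Once c has been chosen from the ratio, the sample size follows as n = np_1/p_1, rounded up to a whole number.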

Sample sizes obtained in this way will be too large when the true distribution of defectives follows the binomial or hypergeometric laws. There is, however, a gain in protection due to the extra inspection. For the binomial case the exact values $p_1$ and $p_2$ for a given $n$ and $c$ can be calculated, using a table of the 5 per cent points of the $F$ (variance ratio) distribution. We may take

$$n_1 = 2(n - c)$$
$$n_2 = 2(c + 1)$$
$$F_1 = F(n_1, n_2)$$
$$F_2 = F(n_2, n_1).$$

Then

$$p_1 = \frac{n_2}{n_2 + n_1 F_1}$$

and

$$p_2 = \frac{n_2 F_2}{n_1 + n_2 F_2},$$

utilizing a property of the $F$ distribution pointed out in [3], page 2.

TABLE I
Single sample inspection plans
α = β = .05

[Table body not recoverable from the source; the legible entries include R_0 = 1.335 for c = 129 and R_0 = 1.251 for c = 215.]

In view of the approximate nature of this table due to the Poisson distribution, it is suggested that when the calculated value of $R_0$ does not appear, the table be entered with the next larger value. This rule will result in partial compensation for the approximation.
ON AN EQUATION OF WALD


By David Blackwell


Howard University


Let $X_1, X_2, \ldots$ be a sequence of independent chance variables with a common expected value $a$, and let $S_1, S_2, \ldots$ be a sequence of mutually exclusive events, $S_k$ depending only on $X_1, \ldots, X_k$, such that $\sum_{k=1}^{\infty} P(S_k) = 1$. Define the chance variables $n = n(X_1, X_2, \ldots) = k$ when $S_k$ occurs and $W = X_1 + \cdots + X_n$. We shall consider conditions under which the equation

$$(1) \qquad E(W) = aE(n),$$

due to Wald [3, p. 142], holds.

This equation has various interpretations:

A. $n$ may be considered as defining a sequential test on the $X_i$. If $a$ and $E(W)$ are known, (1) may be used to determine $E(n)$, the expected number of observations required by the sequential test [3, p. 142 et seq.].

B. $n$ may be considered as representing a gambling system, i.e. it represents the point at which a player decides to stop. $W$ then represents his winnings, and (1), in the special case $a = 0$, says that, if each play is a fair game, then the system leads to a fair game.

C. $n$ may be considered as the duration of a random walk. The meaning of $W$ and (1) is obvious.

More exactly, we shall investigate conditions on $X_i$ under which (1) holds for every test $n$ of finite expected value. Our results, Theorems 1 and 2, are that (1) holds if the $X_i$ have identical distributions, or if they are uniformly bounded. Theorem 1 is a generalization of a result of Wald [3, p. 142].

The test $n$ may be considered as a test on the variables $Y_i = X_i - a$. Then $W' = Y_1 + \cdots + Y_n = W - na$, so that $E(W') = 0$ is equivalent to (1) for tests of finite expected value. Thus it is no loss of generality to assume $a = 0$ and to seek conditions under which $E(W) = 0$. We remark that if $E(n)$ does not exist, then $E(W)$ need not be zero. For example define $X_i = \pm 1$ with probability $\tfrac{1}{2}$, and $n$ as the smallest integer $k$ for which $X_1 + \cdots + X_k = 1$. Then $E(W) = 1$. (It follows from Theorem 1 or 2 that $E(n)$ cannot exist, which can also be shown directly.)

THEOREM 1. If $X_1, X_2, \ldots$ have identical distributions, $E(X_i) = 0$, $E(n) < \infty$, then $E(W) = 0$.

Proof: Define chance variables $n_i$ inductively as follows: $n_1 = n$. Supposing $n_1, \ldots, n_k$ to be defined, define $n_{k+1} = n(X_{n_1 + \cdots + n_k + 1}, X_{n_1 + \cdots + n_k + 2}, \ldots)$, i.e. $n_1, n_2, \ldots$ are the successive values of $n$ obtained by iterating the test. Then

$$(2) \qquad P(n_{k+1} = j \mid n_1 = a_1, \ldots, n_k = a_k) = P(S_j).$$

For the event $\{n_1 = a_1, \ldots, n_k = a_k\} = R$ depends only on $X_1, \ldots, X_{a_1 + \cdots + a_k}$, while under the hypothesis $R$ the event $\{n_{k+1} = j\}$ coincides with the event $S = \{n(X_{a_1 + \cdots + a_k + 1}, \ldots) = j\}$. Thus $P_R(S) = P(S)$. Finally $P(S) = P(S_j)$ since $S$ is defined by imposing the same conditions on $X_{a_1 + \cdots + a_k + 1}, \ldots$ that $S_j$ imposes on $X_1, \ldots, X_j$. (2) shows inductively that $n_1, n_2, \ldots$ are defined everywhere and are mutually independent with identical distributions. Now define $W_k = X_{n_1 + \cdots + n_{k-1} + 1} + \cdots + X_{n_1 + \cdots + n_k}$. A similar argument shows that $W_1 (= W), W_2, \ldots$ are also independent variables with identical distributions. The strong law of large numbers [2, p. 488] asserts that, with probability one,

$$(3) \qquad \frac{X_1 + \cdots + X_N}{N} \to 0 \quad \text{as } N \to \infty.$$

It follows that, with probability one,

$$\frac{W_1 + \cdots + W_k}{n_1 + \cdots + n_k} \to 0 \quad \text{as } k \to \infty.$$

For if $\Big| \dfrac{W_1 + \cdots + W_k}{n_1 + \cdots + n_k} \Big| > \epsilon$ for an infinite number of $k$, then $\Big| \dfrac{X_1 + \cdots + X_N}{N} \Big| > \epsilon$ for an infinite number of $N$, which by (3) is an event of probability zero. Also from the strong law of large numbers $\dfrac{n_1 + \cdots + n_k}{k} \to E(n)$ with probability one. Then

$$\frac{W_1 + \cdots + W_k}{k} = \Big( \frac{W_1 + \cdots + W_k}{n_1 + \cdots + n_k} \Big) \Big( \frac{n_1 + \cdots + n_k}{k} \Big) \to 0$$

with probability one. It follows from the converse of the strong law of large numbers [2, p. 488] that $E(W_i) = E(W) = 0$.

Write $S_1 + \cdots + S_k = U_k$, $C(U_k) = V_k$ so that $V_k = \{n > k\}$. Then (a) $V_k$ depends only on $X_1, \ldots, X_k$, (b) $V_1 \supset V_2 \supset \cdots$, $P(V_k) \to 0$. Conversely any sequence of sets $V_k$ satisfying (a) and (b) defines a sequential test on $X_i$; define $n = k$ on $V_{k-1} C(V_k)$. Moreover $E(n) < \infty$ if and only if (c) $\sum_{k=1}^{\infty} P(V_k)$ converges [1, p. 297]. Now

$$E(W) = \lim_{N \to \infty} \sum_{k=1}^{N} \int_{S_k} (X_1 + \cdots + X_k)\, dP = \lim_{N \to \infty} \sum_{k=1}^{N} \int_{S_k} (X_1 + \cdots + X_N)\, dP$$

$$= \lim_{N \to \infty} \int_{U_N} (X_1 + \cdots + X_N)\, dP = -\lim_{N \to \infty} \int_{V_N} (X_1 + \cdots + X_N)\, dP.$$

This establishes the following

LEMMA: If $E(X_i) = 0$, then $E(W) = 0$ for every test of finite expected value if and only if for every sequence of sets $V_N$ satisfying (a), (b), (c),

$$\int_{V_N} (X_1 + \cdots + X_N)\, dP \to 0.$$

From this condition we obtain easily

THEOREM 2. If $E(X_i) = 0$, $|X_i| < M$, $E(n) < \infty$, then $E(W) = 0$.

Proof: If $V_N$ is a sequence of sets satisfying (a), (b), (c), then

$$\Big| \int_{V_N} (X_1 + \cdots + X_N)\, dP \Big| \le M N P(V_N).$$

Now the series $\sum P(V_N)$ is a convergent series with decreasing positive terms. It is well known that under these conditions $N P(V_N) \to 0$. It follows from the lemma that $E(W) = 0$.

The question of finding sufficient conditions for $E(W) = 0$ more general than those given in Theorems 1 and 2 is of interest. The bare condition $E(X_i) = 0$ is not sufficient, as the following example (which is simply the system of doubling the stake) shows: $X_i = \pm 2^i$ with probability $\tfrac{1}{2}$, $n$ is the smallest integer $k$ for which $X_k > 0$. A simple computation shows $E(n) = E(W) = 2$. It is well known that the expected amount of capital required for the above system is infinite. That this is generally true for such systems is shown by the following theorem, in which no hypothesis is made concerning the existence of $E(n)$.

THEOREM 3. If $E(X_i) = 0$, $E(W) > 0$, then $E(Z) = -\infty$, where

$$Z = \min_{k \le n} (X_1 + \cdots + X_k).$$

Proof: It follows from the proof of the lemma that

$$\int_{V_N} (X_1 + \cdots + X_N)\, dP \to -E(W).$$

Now on $V_N$, $Z \le (X_1 + \cdots + X_N)$. Hence

$$\lim_{N \to \infty} \int_{V_N} Z\, dP \le -E(W).$$

Thus $E(Z)$ cannot exist if $E(W) > 0$, since $P(V_N) \to 0$. Since $Z \le X_1$, $\int_{Z \ge 0} Z\, dP$ exists; consequently $E(Z) = -\infty$.
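The doubling example and the conclusion of Theorem 3 can be followed with exact arithmetic. The sketch below is an illustration only; the helper names are hypothetical, and the expectations are truncated at a finite number of plays, so they approximate E(n) and E(W) from below while the truncated E(Z) keeps decreasing without bound.

```python
from fractions import Fraction

def doubling_outcome(k):
    """Winnings W and lowest running total Z when the first win occurs on play k."""
    running, lowest = 0, None
    for i in range(1, k + 1):
        stake = 2 ** i
        running += stake if i == k else -stake   # lose every play before the k-th
        lowest = running if lowest is None else min(lowest, running)
    return running, lowest

def truncated_expectations(max_plays):
    """Partial sums of E(n), E(W), E(Z) over stopping times k = 1, ..., max_plays."""
    e_n = e_w = e_z = Fraction(0)
    for k in range(1, max_plays + 1):
        prob = Fraction(1, 2 ** k)               # first win on play k
        w, z = doubling_outcome(k)
        e_n += prob * k
        e_w += prob * w
        e_z += prob * z
    return e_n, e_w, e_z

if __name__ == "__main__":
    for plays in (10, 20, 40):
        e_n, e_w, e_z = truncated_expectations(plays)
        print(plays, float(e_n), float(e_w), float(e_z))
```

The winnings equal 2 whatever the stopping time, while each additional permitted play lowers the truncated expectation of Z by roughly one unit.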
CORRECTION TO THE PAPER “ON A PROBLEM OF ESTIMATION
OCCURRING IN PUBLIC OPINION POLLS”


By H. B. Mann
Ohio State University











In the paper “On a problem of estimation occurring in public opinion polls’ 
(Annals of Math. Stat., Vol. 16 (1945), pp. 85-90) the author made the assertion 
that, in the notation of the paper, E[(¢; — 7;)"] is always smaller than E[(e; — e;)’]. 
This statement is incorrect and its supposed proof contains a numerical error 
in the fourth line from above on p. 90. 

We have 


. 1 +00 00 2 1 
E(ri) = \/ 29 [ ; os Qro* exp E- Q(z, Y, p» | dx dy dp; 


-i5f [ p| eta | 
“Slain Bat TY - Oe ie4 
The last integral is tabulated in Karl Pearson's Tables for Statisticians and Biometricians, Vol. 2, p. 93. Comparing this table with a table of the normal probability integral it may be seen that there exists a value $\bar{c}$ such that

$$E(e_i) > E(r_i) \quad \text{for } c < \bar{c},$$
$$E(e_i) < E(r_i) \quad \text{for } c > \bar{c}.$$

The quantity $\bar{c}$ lies in the neighborhood of 2.
I am indebted to Professor J. W. Tukey for bringing the error to my attention.
















NEWS AND NOTICES 
Readers are invited to submit to the Secretary of the Institute news items of interest 
Personal Items 


The following members of the Institute are teaching in Army University Cen- 
ters in Shrivenham, England; Biarritz, France; and Florence, Italy: T. A. 
Bancroft, Alonzo Cohen, E. E. Blanche, P. R. Rider. 

Dean Walter Bartky of the University of Chicago has been appointed as the 
representative of the Institute of Mathematical Statistics to the Division of 
Physical Sciences of the National Research Council. 

Mr. Clyde A. Bridger represented the Institute at the Inauguration of Dr. 
F. S. Harris as President of Utah State Agricultural College on November 16. 

Dr. C. West Churchman has resigned his position at Frankford Arsenal and has
accepted the appointment of Assistant Professor of Philosophy at the University 
of Pennsylvania. 

Assistant Professor D. B. DeLury of the University of Toronto has been ap- 
pointed to an associate professorship at Virginia Polytechnic Institute. 

Mr. George Eldredge, formerly with the Aluminum Research Laboratories at 
New Kensington, Pennsylvania is now corrosion chemist with the Shell De- 
velopment Company at Emeryville, California. 

Dr. Will Feller of Cornell University has been appointed as the representative 
of the Institute of Mathematical Statistics on the Policy Committee of the 
Mathematical Organizations. 

M. Bernard Hecht has joined the International Resistance Company, Phil- 
adelphia, as head of the Quality Control Department. 

Lt. Col. Paul Horst has returned to his previous position at Procter and
Gamble at Cincinnati. 

Professor Harold Hotelling of Columbia University has been made a part time 
consultant on statistical problems to the Division of Statistical Standards of 
the Bureau of the Budget. 

Dr. S. B. Littauer is now chairman of the Mathematics Department of Newark
College of Engineering at Newark, N. J.

Lieutenant Commander A. L. O’Toole has been decorated with a Bronze 
Star Medal for his outstanding service in the South Pacific during the past two 
years. 

Associate Professor H. H. Pixley of Wayne University has been appointed 
Assistant Dean of the College of Liberal Arts. 

Dr. H. B. Mann has been appointed to an associate professorship at Ohio 
State University. 

Miss Dorthy J. Morrow has been appointed to an assistant professorship at 
George Washington University. 

Professor C. J. Rees of the University of Delaware has received a citation for 
his work in a civilian capacity with the 14th Air Force Headquarters. 
Dr. L. V. Toralballa is a special instructor in the Mathematics Department at
the University of Michigan. 

Associate Professor Abraham Wald of Columbia University has been promoted 
to a professorship. 

Mr. Grover C. Wirick, Jr. is doing graduate work at the University of 
Michigan. 

Henry Goldberg of the Columbia University Statistical Research Group died 
April 19, 1945. 


During the last quarter of 1945, many members of the Institute engaged in 
statistical quality control were favored by visits from Messrs. W. A. Bennett and 
M. Milbourn, the successful candidates in a scholarship competition organized 
by the Quality Control Panel associated with the Midland Region of the British 
Ministry of Production. In the competition, which offered a three
months' trip to the United States as a prize, 92 papers on industrial applications
of statistical methods were submitted. This Panel has been active in organizing
regular discussion groups and in arranging courses of lectures at the Birmingham 
Technical College, later published by the Birmingham District Committee as 
a ‘Symposium of Papers on Quality Control’’, copies of which are still available. 

Mr. Bennett is Works Manager of the English Needle and Fishing Tackle 
Co., Ltd., of Redditch, and Mr. Milbourn is a physicist who has worked mainly 
in the field of spectrographic analysis and physical metallurgy in the Research 
Department of Imperial Chemical Industries, Metals Division, Birmingham. 
It is natural, therefore, that Mr. Bennett’s paper dealt with the management 
problem of organizing a Statistical Quality Control Bureau and defining its 
duties, whereas Mr. Milbourn’s paper considered the operation of quality control 
techniques as a means for detecting and identifying causes in production research. 

Toward the close of their visit in this country they indicated that the future 
of Quality Control, both here and abroad, will depend on establishing an adequate 
theory of control that includes statistical along with all other necessary factors. 
This provides a challenge that must be answered by the statistical societies and 
the colleges, as well as by the quality control people. 


New Members 
The following persons have been elected to membership in the Institute:


Bal, Kenan Y. (Columbia) Statistical Control, Hq. AFPDC, 830 West Broadway, Louisville 
3, Kentucky 

Coles, James Stacy, Ph.D. (Columbia) Research Supervisor, Underwater Explosives Research
Laboratory, Woods Hole Oceanographic Institution, Box 631, Woods Hole, Mass.

Frank, David H. Administrative Ass’t, Long Island City H.S., 411 W. 114th St., New 
York 25, N. Y. 

Greider, C. Edwin, Jr., B.A. (Michigan) Actuarial Clerk, 1066 Glenwood Blvd., Schenectady,
N. Y.

Gulliksen, Prof. Harold, Ph.D. (Chicago) Psychology Dept., Princeton University, Prince- 
ton, N. J. 

Harrison, Joseph O., Jr., B.S. (George Washington) 2605 Kingsbridge Ave., Apt. 3F, New
York, N. Y. 

Hodges, Joseph Lawson, Jr., A.B. (California) Operations Analyst, Army Air Forces, 
1857 Park Road, N.W., Washington 10, D.C. 

Hoskins, Robert Heywood, A.B. (Harvard) Radio Technician, Third Class, U. S. Navy 
Teaching Fellow in Mathematics, Harvard University, Separation 3, Separation Center, 
Shoemaker, California 

Lowry, Edward D. Statistician (Western Cartridge Co., E. Alton) 692 5th St., East 
Alton, Ill. 

Rees, Prof. Carl J., Ph.D. (Pennsylvania) Head of Math. Dept., Univ. of Delaware, 
Newark, Delaware

Seth, Gobind Ram, M.A. (Delhi) Lecturer in Math., Hindu College, Delhi (On Leave),
1345 John Jay Hall, 116th Street, Columbia University, New York 27, N. Y.

Silber, Jack, B.S. (Chicago) 4908 N. Springfield Ave., Chicago 26, Ill. 

Stone, Goldie F., A.M. (New York) 678 Dawson St., Bronx, New York, N. Y.

Szatrowski, Zenon, Ph.D. (Northwestern) Instructor in Economics Department, North- 
western University, Evanston, Ill. 

Wadley, Francis Marion, Ph.D. (Minnesota) Statistical Consultant, Bur. of Entomology 
and Pl. Quar., USDA, 3215 N. Albemarle, Arlington, Virginia

Waugh, Frederick V., Ph.D. (Columbia) Agricultural Economist (Office of War Mobil. 
and Recon.) 1006-26 Street, South, Arlington, Virginia 





REPORT ON THE CLEVELAND MEETING OF THE INSTITUTE 


A meeting of the Institute of Mathematical Statistics was held in Cleveland, 
Ohio, Thursday to Sunday, January 24-27, 1946 in conjunction with the Annual 
Meetings of the American Statistical Association and the Econometric Society.
The following 115 members of the Institute attended the meeting: 


Beatrice Aitchison, Armen A. Alchian, Franz L. Alt, Richard L. Anderson, Kenneth J.
Arnold, Max Astrachan, George J. Auner, Kenan Y. Bal, Walter Bartky, William D. Baten,
Harold R. Bellinson, Archie Blake, Chester I. Bliss, Albert H. Bowker, T. H. Brown, Robert
W. Burgess, Oscar K. Buros, Irving W. Burr, Burton H. Camp, C. West Churchman,
William G. Cochran, Edward P. Coleman, Francis G. Cornell, Jerome Cornfield, Donald R. G.
Cowan, Dudley J. Cowden, Gertrude M. Cox, John H. Curtiss, Joseph F. Daly, Cuthbert
Daniel, Besse B. Day, Walter L. Deemer, Jr., Daniel B. DeLury, W. Edwards Deming,
Bernard Dempsey, Paul S. Dwyer, Churchill Eisenhart, Mary L. Elveback, Benjamin
Epstein, Wilmoth D. Evans, Carl H. Fischer, Irving Fisher, T. N. E. Greville, Trygve
Haavelmo, Clausin D. Hadley, Margaret J. Hagood, K. W. Halbert, Morris H. Hansen,
Boyd Harshbarger, Byron R. Hayden, Harold Hotelling, Earl E. Houseman, Leonid
Hurwicz, William Hurwitz, Calvin J. Kirchen, Lila F. Knudsen, Hendrik S. Konijn, Tjalling
Koopmans, Morton Kramer, Anita R. Kury, Robert Ladd, Dickson H. Leavens, Roy
Leipnik, E. Vernon Lewis, Eugene Lukacs, Henry B. Mann, George F. T. Mayer, Edward C.
Molina, Alexander M. Mood, Margaret Moore, Joseph E. Morton, Frederick C.
Mosteller, Charles McC. Mottley, Paul M. Neurath, Horace W. Norton, Edwin G. Olds, Paul S.
Olmstead, Guy H. Orcutt, James G. Osborne, Russell F. Passano, Paul Peach, Alice E.
Andrews Priestley, James Rafferty, Sophie Rakesky, Charles F. Roos, A. C. Rosander,
Herman Rubin, Phillip J. Rulon, Marion M. Sandomire, Franklin E. Satterthwaite, Esther
Schaeffer, Edward M. Schrock, David H. Schwartz, G. R. Seth, Lawrence W. Shaw, Jack
Sherman, Walter A. Shewhart, Walt R. Simmons, Leslie E. Simon, John H. Smith, J. R.
Steen, Joseph Steinberg, Henry W. Steinhaus, J. W. Sullivan, Zenon Szatrowski,
Benjamin Tepping, John W. Tukey, Helen M. Walker, W. Allen Wallis, A. E. R. Westman,
S. S. Wilks, Elizabeth W. Wilson, Charles P. Winsor, Gerald N. Winston, Theodore O.
Yntema.


The first session of the meeting was held jointly with the American Statistical 
Association on Thursday afternoon on Numerical Solution of Regression Equa- 
tions, under the chairmanship of Dr. W. E. Deming of the Bureau of the Budget. 
The following papers were presented: 


1. A Machine for Determination of Correlation and Regression Coefficients.
Dr. Guy Orcutt, Massachusetts Institute of Technology
2. A Square Root Method for the Solution of Regression Equations.
Mr. D. B. Duncan, Royal Australian Air Force
3. Error Control in Matrix Calculation.
Dr. F. E. Satterthwaite, Aetna Life Insurance Company
4. The Compact Computation of Canonical Correlations.
Professor P. S. Dwyer, University of Michigan.


On Friday, a symposium, consisting of a morning and an afternoon section, 
was held jointly with the Econometric Society and the American Statistical 
Association on Estimating Relations from Nonexperimental Observations. Dr. 
Mordecai Ezekiel acted as chairman of the morning session and Dr. R. L.
Anderson was chairman of the afternoon session. Of the following four
papers, the first two were presented in the morning and the last two in the
afternoon:


1. The Economist's Problem of Statistical Inference.
Professor J. Marschak, Cowles Commission
2. Prediction and Structural Estimation.
Mr. Leonid Hurwicz, Cowles Commission
3. Iterative Computation Methods in Estimating Simultaneous Relations.
Dr. T. Koopmans and Mr. Roy B. Leipnik, Cowles Commission
4. Multivariate Analysis in Economics.
Professor Gerhard Tintner, Iowa State College


On Friday afternoon a session on Experimental Designs and their Analysis
was held jointly with the Biometrics Section of the American Statistical
Association under the chairmanship of Professor Gertrude Cox of North Carolina State
College. The following papers were presented:


1. On the Uses of Orthogonal Functions in the Analysis of Incomplete Latin Squares.
Professor D. B. DeLury, Virginia Polytechnic Institute
2. Use of Adjusting Factors in the Analysis of Data with Disproportionate Subclass Numbers.
Professor R. E. Patterson, Texas A. and M. College
3. Selection of Sample Size for Detecting Treatment Differences.
Professor A. M. Mood, Iowa State College
4. Rectangular Lattices.
Professor Boyd Harshbarger, Virginia Agricultural Experiment Station


On Saturday, a two-session symposium was held jointly with the Econometric
Society and the American Statistical Association on Sampling in the Social
Sciences. Professor Arnold J. King of Iowa State College acted as chairman for
the morning session and Professor S. S. Wilks of Princeton University presided
in the afternoon. The following seven papers were presented, of which the first
three were presented in the morning and the remainder in the afternoon:


1. Problems and Methods of a Sample Survey of Business.
Mr. M. H. Hansen, Bureau of the Census
2. Problems of Area Sampling in Agriculture.
Mr. J. R. Goodman, Bureau of the Census, and Mr. E. E. Houseman, Bureau of
Agricultural Economics
3. Problems of Area Sampling in Population.
Mr. B. J. Tepping and Mr. J. S. Steinberg, Bureau of the Census
4. The Problems of Non-Response.
Mr. W. N. Hurwitz, Bureau of the Census
5. Systematic Sampling and its Relation to Other Sampling Designs. (Read by Title.)
Mrs. Lillian H. Madow, Washington
6. Relative Accuracies of Systematic and Stratified Random Sampling for a Specified Class
of Populations.
Professor W. G. Cochran, Iowa State College
7. On the Design of a Sample of Dealers’ Inventories.
Dr. W. E. Deming, Bureau of the Budget, and Dr. Willard Simmons, Office of Price
Administration


On Sunday, a symposium was held jointly with the American Statistical 
Association on Acceptance Sampling under the chairmanship of Professor John 
W. Tukey of Princeton University. The morning session of the symposium 
was devoted to acceptance sampling by attributes and the afternoon session to 
acceptance sampling by variables. The following program was presented at the 
morning session: 


Papers: 
1. Prewar Developments. 
Mr. Paul Peach, North Carolina State College 
2. Wartime Developments. 
Professor E. G. Olds, Carnegie Institute of Technology 
Prepared Discussion by: 
Mr. H. R. Bellinson, Army Ordnance Department 
Mr. D. H. Schwartz, Quartermaster Corps 
Professor Walter Bartky, University of Chicago
In the afternoon session the following program was presented: 
Papers: 
1. Lot Quality Measured by Average or Variability. 
Lt. Commander J. H. Curtiss, Bureau of Ships
2. Lot Quality Measured by Proportion Defective. 
Mr. W. A. Wallis, Columbia University 
Prepared Discussion: 
Mr. E. M. Schrock, Army Ordnance 
Professor A. M. Mood, Iowa State College 
Professor K. J. Arnold, University of Wisconsin 
Lt. Commander J. F. Daly, Bureau of Ships
Dr. A. E. R. Westman, Ontario Research Foundation 


A business meeting of the Institute was held at 5 p.m. on Saturday afternoon 
at which time reports were made by the President, Secretary-Treasurer, Editor 
and Chairman of the Committee on Development. These reports are all 
printed in the current issue of the Annals. 


Paul S. Dwyer,
Secretary. 





ANNUAL REPORT OF THE PRESIDENT OF THE INSTITUTE 
(For 1945) 


I. DEVELOPMENT OF PUBLIC APPRECIATION FOR MATHEMATICAL STATISTICS 


The aims of the Institute, as stated in the constitution, are to promote the 
interests of mathematical statistics. First and foremost, research must go on.
The Annals must be published and its position maintained as the world’s leading 
journal in mathematical statistics. Meetings must be held to provide for further 
dissemination and discussion of research. But this is not all. We should fall 
short of our opportunities for promoting the interests of mathematical statistics 
if we were to lose sight of the need for creating an environment in which mathe- 
matical statistics and statisticians can thrive and take their proper place for 
rendering the service that they are capable of rendering in the political, industrial, 
and scientific life of the nation. 

A fair share of the efforts of the officers and committees of the Institute this 
past year has been devoted to the creation of this environment. The Institute 
has assumed leadership in several movements of importance in this direction 
and has lost no opportunity to cooperate with other organizations toward the 
same ends. Momentum has thus been given to important developments which 
are bound to affect the scientific advancement and employment opportunities 
of all people engaged in statistical work of any kind, whether it be mathematical 
research, consulting, teaching, major or minor roles in large-scale statistical 
projects, preparing questionnaires, designing experiments, analyzing results, 
formulating conclusions and recommendations, or taking part in any other way 
in the collection or use of statistical data. Briefly, these developments fall 
under three main headings. 

(i) Setting standards of professional competence. The Description of the Pro- 
fession of Statistics, put out by the National Roster this year, has gone a long 
way as a first step toward setting standards of professional competence. The 
officers and many members of the Institute assisted the Roster, particularly 
Professor Harold Hotelling and his Committee on the Teaching of Statistics, 
together with Dr. C. I. Bliss representing the American Statistical Association. 
Although the Roster Description is not intended to represent the official attitude 
of the Institute, it does represent cooperative effort toward cultivation of public 
understanding of statistical work. 

(ii) Raising the standards of teaching. Standards of teaching go hand in hand
with standards of professional competence. The Institute can proudly point 
to the accomplishments of its Committee on the Teaching of Statistics, which 
under the chairmanship of Professor Hotelling, has persistently set forth stand- 
ards of teaching which are bound to bring about important changes in the ar- 
rangement of statistical courses and organization of statistical teaching. An 
inevitable result will be greater competence in statistical theory, better research, 
and expanding avenues for more effective application of theory. 

(iii) Promoting public understanding and appreciation for the statistician.
More adequate public appreciation of statistical theory can be brought about in 
several ways. The first two of these are being actively pursued by the officers 
of the Institute. The third constitutes a proposal; and the fourth, an obligation 
incumbent on every member of the Institute. 

First, through joint meetings with other professions such as sociologists, 
economists, psychologists, engineers, biometricians, etc. The Cleveland meet- 
ing is an example; the St. Louis meeting of the AAAS to be held in March is 
another. These joint sessions give opportunity for other groups to become 
aware of the impact of mathematical statistics on their own work, and for mathe- 
matical statisticians to hear of the statistical problems in other fields. Opportu- 
nities for such diffusion of knowledge exist in local chapters as well as in national 
meetings, and every member of the Institute should be on the lookout for oppor- 
tunities to explain how problems in administration, management, economics, 
and manufacturing, are going to require modification in the future owing to new 
work in sampling techniques, acceptance procedures, quality control, and other 
developments of mathematical statistics. 

The federation of statistical societies (see Part III) will afford better means 
than existed heretofore for an admixture of mathematical statistics with fields 
of application, both in national and local meetings. 

Second, through the work of committees whose responsibility is to advise 
professional groups, and government and private research agencies, concerning 
the use of mathematical statistics. A notable example is the Joint Committee 
for the Development of Statistical Applications in Engineering and Manufactur- 
ing, of which Dr. W. A. Shewhart is chairman. The Institute has two repre- 
sentatives on it. Much of the recent advancement of statistics in industry is 
traceable to the work of this committee. 

Third, through the establishment and publication of colloquium lectures as 
recommended by Dr. Shewhart in his report for the preceding year, or of an 
annual Rietz lecture of broad interest as recommended by this year’s Committee
on Development (cf. Appendix A, Part V). 

Fourth, information through expository nonmathematical articles and lectures 
delivered by leading mathematical statisticians before gatherings of nonstatisti- 
cal groups of professional and business men. Such activity is of course informal 
and without record, carried on by individuals as opportunity permits and not by 
official announcement from the office of the Institute. 


II. LONG-RANGE PLANNING 


Through the work of several of the Institute’s committees, each tackling 
specific areas of enquiry, the Institute is being provided with long-range policies 
and planning. In particular, the reports of the following committees should be 
cited in this connection: 

The Committee on Development (Appendix A) 

The Committee on the Teaching of Statistics (Appendix B) 
The Committee on Finance (Appendix C)
The Committee on Policy in Regard to Local Chapters (Appendix D) 

These committees are obviously alive to the recent rapid expansion of mathematical
statistics in industry and government, and to the opportunities that lie
ahead for developing proper environment for greater expansion and service of
mathematical statistics.


III. FEDERATION OF STATISTICAL SOCIETIES 


A movement of extreme importance to all statistical workers is the proposed 
reorganization of the American Statistical Association as the central organization 
for all statistical societies. This movement owes its impetus largely to the 
recommendation made by our Committee on Development a year ago, and to 
the active part that our officers and representatives played in organizing and 
assisting the Inter-Society Committee. This movement is centripetal and 
replaces the centrifugal forces that were splitting statistical organizations. 
Under the new arrangement, statistics will possess a united front on matters of 
common interest, yet each organization will maintain its autonomy. Nothing 
is to be sacrificed in the way of standards of membership, meetings, or publica- 
tions. Economies will be effected through combined office operations. Much 
will be gained through coordinated effort; wide distribution of a journal of 
general methodology and applications; development of public appreciation for 
statistical work through dissemination of reliable information concerning statis- 
tical science and its contributions; cooperation with local and international 
statistical groups; promotion and development of professional standards of 
statistical work; and through cooperation with other professional groups in 
fields of application. 

This federation is not yet accomplished; it is still in process of formulation, but 
it is probably safe to say that agreement on general aims has been reached, as 
well as on many items of detail. The proposition will in time be put up to each 
statistical organization for acceptance. 


IV. GROWTH AND EXPANSION 


During the year the membership increased from 606 to 777. The work of 
the Institute, vitally affecting many thousands of statistical workers through 
its efforts to enhance public confidence and appreciation for theoretical statistics 
as well as to improve the quality of statistical work, extends far beyond the en- 
vironment of its nearly 800 members. Concerted drives for membership should 
continue, but should not be expected to take the place of personal invitation 
in the form of explanation, one man to another, of what the Institute stands for. 
The outlook is encouraging. Year by year as the work and influence of the 
Institute receive wider success and recognition, more and more people will be 
found ready and desirous of joining. 


V. ADMINISTRATIVE AFFAIRS


As with any active organization, there are certain chores to be done and inter- 
nal affairs to be administered. The chief burden falls on the executive officer, 
our Secretary-Treasurer, Paul S. Dwyer, who is expected 

i. To keep the list of members up to date with addresses and titles. Furnish 
information to the Board regarding increases and decreases in member- 
ship, and issue the Directory. 

ii. To send out notices, to keep the membership informed concerning meetings
and other items of interest.

iii. To send out bills, and keep the books showing payment of dues and sub- 
scriptions. 

iv. To fill orders for back numbers of the Annals. 

v. To estimate the probable demand for copies of the Annals, current and
past, and to place orders with the printer to be able to supply the demand.

vi. With the Committee on Finance, to keep the Board posted on the expected
expenditures and income for the year ahead.

vii. To answer correspondence from other organizations and individuals who
desire information concerning the Institute.

viii. To keep a record of proceedings of the Board and business meetings of the
Institute.

ix. To work with the various committees of the Institute, keeping them informed
and in line on policy, constitution, by-laws, and other commitments.

x. With the Committee on Programs, to arrange sessions of contributed
papers, and to find space in hotels or elsewhere for holding meetings and
housing members.

xi. To keep the Board informed concerning recommendations and reports of
committees, and other matters brought to his attention requiring action
by the Board.

xii. To conduct continuous membership and subscription drives with or with- 
out the aid of committees. 

It is obvious that when an organization reaches the size and activity of the 
Institute, these duties are too onerous to carry on without proper assistance. 
Our Secretary-Treasurer should be freed for proper performance of important 
functions which only he can render toward the growth and vitalization of the 
Institute. Consideration is being given to two possible plans, either of which 
will call for some increase in expenditure. One plan is to provide competent and 
sufficient assistance in the office of the Secretary-Treasurer, and the other is to 
transfer some of his duties (e.g., Items i, ii, iii, iv, x, and xii) to the American
Statistical Association on a cost basis. A cooperative arrangement of this kind
between the A.S.A. and the Institute has been discussed informally with Mr. 
Lester Kellogg, Secretary of the A.S.A., who will be able to provide us with cost 
estimates a little later. This kind of arrangement would be a first step and would serve
as a pilot study in cost-accounting for the ultimate federation of statistical
societies (Part III).

The constitution must be revised, and a committee has been formed to under- 
take the task. The one we have has served well, with minor revisions, over 
the first ten years in the life of the Institute, but conditions are now different and 
thorough reconsideration is needed. Among other things, it needs to be revised 
to permit federation with other statistical societies. As it stands it is totally 
deficient in specifying responsibilities between local chapters and the parent 
society. It should embody the recommendations of the Committee on Policy 
in Regard to Local Chapters, or modifications of these recommendations. Also,
there are ambiguities in the present constitution that need to be cleared up, and 
there is no provision for carrying out the business of the Institute by correspond- 
ence when a Board meeting or Committee meeting can not be held. 

The Committee on Meetings must not only seek out suitable papers for meet- 
ings, carrying out the wishes of the Board in regard to the subject-matter to be 
covered, but must also be concerned with the geographic location of meetings, 
cooperation with other professional societies, and choice of dates. During the 
past few years, in addition, this committee has had to contend with restrictions 
on transportation and hotel space. The Committee on Finance must decide 
what expenditures are wise and allowable; they must make decisions on in- 
vestments and surety bonds. They have calculated the price of life-memberships 
for purchase at various ages. Committees on Membership and on Subscriptions 
must be active. The services rendered by these committees deserve the grateful 
thanks of the members of the Institute. 

Undoubtedly the most lasting contribution that is being made by the Institute 
to research in mathematical statistics is the publication of the Annals of Mathe- 
matical Statistics. Without some first-hand knowledge of the problems that are 
encountered in publishing a professional journal of high standing it is hardly 
possible to be conscious of the depth of the debt owed by the Institute to Dr.
Samuel S. Wilks, Editor. During the past few years, in addition to the normal
editor’s problems of maintaining standards of excellence in the articles published, 
there have been additional difficulties and delays arising from paper and man- 
power shortages in printing. 

In closing this section it is a pleasure to record our appreciation of the as- 
sistance and advice received at various times during the year from Mr. Lester 
Kellogg, Secretary of the A.S.A.; also from Mr. E. A. Stephens of the Ohio Bell 
Telephone Company in Cleveland in regard to the difficult problems of hotel 
space which arose in connection with the Cleveland meeting in January 1946. 




















VI. ELECTION OF FELLOWS


Acting in consideration of the advice of the Committee on Membership, the 
Board advanced the following members to the grade of Fellow: 
M. S. Bartlett, Cambridge University
Trygve Haavelmo, The Norwegian Embassy 
William N. Hurwitz, Bureau of the Census
John von Neumann, Institute for Advanced Study 


VII. ELECTION OF OFFICERS


The following officers were duly nominated and elected for 1946: 
President, William G. Cochran 
Vice Presidents, Will Feller 
Edwin G. Olds 


VIII. COMMITTEES AND REPORTS OF COMMITTEES


Our committees and representatives on joint committees for the year 1945 are 
shown below. The reports of these committees are appended for the information 
of members. It should be borne in mind that committee reports are for con- 
sideration of the Board; they do not commit the Board to any specific action one 
way or another. As already intimated, every member of the Institute may take 
pride in the splendid work of these committees. Like the deliberations of the 
Board, most of the deliberations of the committees were necessarily carried out 
by correspondence because no large meetings were held at which the members 
of any committee or the Board could all be brought together. 

During the year we have been asked by Dean L. P. Eisenhart, Chairman of 
the Division of Physical Sciences of the National Research Council, to name a 
representative. The Board duly appointed Dean Walter Bartky. The invita- 
tion from Dean Eisenhart to be so represented is a distinct honor and a recogni- 
tion of the importance of the Institute in pure and applied research. 

We have also been invited to name a representative to the Policy Committee 
for Mathematicians, to which the Board has named Professor Will Feller. On 
the committee are four representatives from the American Mathematical So- 
ciety, one from the Society for Symbolic Logic, and one from the Institute of 
Mathematical Statistics. The Mathematical Association of America has been 
invited to name two representatives. The constitution and purposes of this 
committee are explained in the following paragraphs which are taken from a 
statement that was approved by the A.M.S. Council on November 23, 1945: 


Representatives of each organization shall be selected in accordance with a plan approved 
by the governing body of that organization. 

The Secretary of the American Mathematical Society shall be a non-voting, ex officio 
member of the committee and shall act as secretary for the committee. 

The Policy Committee shall study those problems affecting the mathematical profession 
which are the common concern of the constituent organizations. It shall be empowered 
to speak for the constituent organizations on matters which concern the position of mathe- 
matics in such matters as proposed or enacted legislation concerned with science, problems 
concerning the effective use of mathematicians or potential members of our profession, and 
other questions which tend to affect the dignity and the effective position of mathematics 
among related sciences, both nationally and internationally. 

Nothing in the powers of this committee shall be construed to affect any commitments 
already made on a national or international basis by any of the constituent organizations 
(i.e., among these is the International Congress of Mathematicians for which an invitation
was issued by the American Mathematical Society in 1936). 

This Policy Committee shall be appointed for a period of five years. At the end of that 
time the work of the committee shall be reviewed and a decision made concerning the con- 
tinuation of the committee. 


A supplemental motion passed by the A.M.S. Council asks the Policy Com- 
mittee to concern itself primarily with the profession of mathematics and only 
secondarily with the teaching of mathematics. 


W. EDWARDS DEMING,
President, 1945. 
Appendix A










Report from the Committee on Development 


I. GENERAL 


Continuing the work of the 1944 Committee on Post-War Development, this 
Committee has analyzed the purpose and policy of the Institute to see what 
additional activities the Institute should undertake in order to provide further 
stimulus to the development of the field of mathematical statistics. The fol- 
lowing existing and proposed activities were considered: 

1. Maintenance of professional standards

2. Publications program

3. Meetings program

4. Rietz Lecture

5. Chapter policy

6. Cooperation in determining educational standards

7. Maintaining relationships with other technical societies

8. Increasing membership of the Institute
In general, each of these activities is placed in the hands of a committee. Except 
in a few instances, reports of these committees have not been published in the 
Annals. This committee recommends that each of the committees of the Institute
together with the representatives of the Institute on joint committees be
requested by the Board of Directors to submit a yearly report for possible publica- 
tion in the March issue of the Annals so that the members of the Institute may 
be kept informed concerning the Institute’s affairs. 


II. PROFESSIONAL STANDARDS 


This committee believes that the Report of the Membership Committee 
published in the March 1945 issue of the Annals is typical of the kind of report 
desired, providing, as it does, an outline of present standards for membership 
in the Institute. 


III. PUBLICATIONS


The publication program has been discussed with the Editor and we find that 
we are in agreement with the present editorial policy. We recommend that the 
Editor submit a yearly report. 

Although an increased membership among those engaged primarily in the 
application of statistics is desirable, it is not considered advisable to alter radically 
the character of the Annals in order to attract such membership. However, 
writers on theoretical topics in the Annals should be encouraged to include illus- 
trations of applications whenever feasible. A desirable goal at which to aim 
would be for every issue of the Annals to contain an expository paper reviewing 
progress in a broad field of theory or devoted to new fields of existing theory 
(these functions are not mutually exclusive). It seems more difficult to obtain 
good papers of this kind than research papers. Now that statisticians are leav- 
ing war work the prospect for obtaining such papers should improve. The 
committee has been informed that the Editor has invited certain writers to 
contribute expository papers on assigned topics and it is recommended that this 
policy be continued. It is believed that the members of the Institute would 
like to be informed in the Editor’s report concerning progress in receiving such 
papers. 

Last year this committee considered the possibility that the Institute sponsor 
the publication of a series of books and monographs. In view of recent develop- 
ments in the commercial publishing field it seems that there is ample opportunity 
for the publication of such works as the Institute might otherwise undertake to
publish, and the committee therefore recommends against such Institute action
at this time.


IV. MEETINGS 


Under normal conditions of transportation, the Institute has held at least two 
meetings each year, one with the mathematical societies in the summer and one
with the social science societies in the winter. This committee favors the con- 
tinuation of this system. Occasionally, meetings have been held with an en- 
gineering society. This program does not provide specifically for joint meetings 
with societies devoted to (a) standardization, (b) engineering, or (c) natural 
sciences. Arrangements for meetings under (a) and (b) could be made through 
our representatives on the Joint Committee for the Development of Statistical 
Applications in Engineering and Manufacturing, which has representation from 
each of these groups. This committee recommends that the Program Committee 
have on its membership one of the Institute’s representatives on the Joint Com- 
mittee and one who is active in the natural sciences. Important duties of these 
members are to give advice on the type of program desired for joint meetings 
in these applied fields and to make arrangements for the meetings. It is also 
recommended that the Program Committee include Institute members who are 
active in the mathematical societies and in the social science societies so that our 
participation in meetings with these groups will be integral to their programs. 
Other members of the Program Committee may be chosen with similar aims in 
mind. The yearly report of the Program Committee should discuss among 
other matters the progress made in arranging joint meetings. 


V. RIETZ LECTURE


To direct attention to the work of the Institute, it is recommended that the 
Institute sponsor an annual lecture of broad interest, to be named after its first 
president, the late Professor Henry L. Rietz. It is suggested that the lecturer 
be appointed by the Board of Directors, that he be given a year’s notice, and that 
the lecture be arranged for a meeting with an appropriate society.


VI. CHAPTERS 


In establishing chapters, the Institute has undertaken obligations that to date 
have not been fulfilled. Two courses are open. Either the Institute should 
abolish its existing chapters or it should formulate a policy that will provide for a 
vigorous chapter program. Some requirements for chapters have been set down 
by the Committee on Policy with Regard to Local Chapters (Appendix D). 
It is proposed that this be submitted to the secretaries of our chapters for their 
comments. Further, certain broader aspects of the problem require additional 
consideration. Discussion with various members of the Institute indicates 
that some believe that the interests of the Institute because of its relatively 
small membership might be better served by organizing geographical sections 
rather than chapters. Pending final agreement on these points, this committee
recommends that the Board of Directors hold in abeyance any requests for the
formation of new chapters. 


VII. EDUCATIONAL STANDARDS


The matter of educational standards for college courses is now in the hands 
of the Committee on the Teaching of Statistics. Such a committee should be a 
permanent committee of the Institute. 

It is our further recommendation that one member of this committee be one 
of the representatives of the Institute on the Joint Committee for the Develop- 
ment of Statistical Applications in Engineering and Manufacturing. It should 
be his duty to assess needs for statistics courses, particularly in relation to stand- 
ardization and engineering. 


VIII. RELATIONSHIPS WITH OTHER TECHNICAL SOCIETIES 


In 1929, the Joint Committee for the Development of Statistical Applications 
in Engineering and Manufacturing was formed. The Institute has had two 
representatives since 1937. The other sponsor societies for the Joint Com- 
mittee are: 

American Society of Mechanical Engineers
American Society for Testing Materials
American Statistical Association
American Mathematical Society
American Institute of Electrical Engineers


Much of the use of statistical method in the war effort is traceable directly to the
activity of this committee. In particular, this committee is working continuously
to see that statistical methods and statistical concepts are introduced in
connection with work on standardization, engineering, and the natural and social 
sciences. In a report published in the December 1940 issue of the Annals, the 
Institute’s War Preparedness Committee made the following recommendations: 


The Institute should “cooperate to the fullest in matters pertaining to quality control
and specification with the ‘Joint Committee for the Development of Statistical Applications
in Engineering and Manufacturing,’ of which the Institute is a sponsor.”


Six specific steps for a cooperative program with the Joint Committee were 
outlined. However, although this report was accepted by the Board, no action 
was taken on these recommendations. In view of the above, we make the 
following recommendations to the Board: 


1. That the Institute’s representatives be requested to make a report on the activities 
of the Joint Committee. (This should be the first of a series of yearly reports.) 

2. That the Board request a report from the Joint Committee on the status of statistics 
and statisticians in engineering and manufacturing including forecasts of future needs 
and opportunities. 

3. That the Board request a report from the Joint Committee on the status of statistics
in the training of engineers including recommendations for such training in the future.

4. That at least one of the Institute’s representatives be from the engineering or manufacturing
field.


IX. GROWTH OF THE INSTITUTE


The Committee on Development has examined the record of growth of the 
Institute and finds that the largest increase in recent years has been among 
people from industry, a group that is still less than a quarter of the total mem- 
bership. It is believed that the program outlined above will stimulate growth 
in membership among all users and potential users of mathematical statistics.


X. PUBLICIZING MATHEMATICAL STATISTICS


This Committee recommends that the Institute make available to appropriate 
channels of public information reliable communications concerning mathematical 
statistics. As a specific recommendation, the case for the science of statistics 
should be presented at the hearings of the National Research (Science) Founda- 
tion Acts pending in Congress, preferably by representatives acting jointly for 
the Institute and the American Statistical Association.


XI. THE INTERSOCIETY COMMITTEE


A second meeting of the Intersociety Committee mentioned in last year’s 
report is to be held on December 8th. This Committee feels that consideration 
of proposals for reorganization of the Institute should not be undertaken prior 
to advice concerning the action of that Committee. 
W. G. COCHRAN, Chairman          P. S. OLMSTEAD, Acting Chairman
C. I. BLISS                      C. C. CRAIG
F. C. MOSTELLER                  H. SCHEFFÉ

November 5, 1945


Report from the Committee on the Teaching of Statistics


A preliminary draft of recommendations on the teaching of statistics was read
by the chairman of this committee at the Rutgers meeting of the Institute in
September 1945. These recommendations are at present being re-drafted by
members of the Committee and it is hoped that they will be ready to present
to the Board in the near future for possible publication in the Annals.

Assistance was rendered during the first part of the year to the National Roster 
of Scientific and Specialized Personnel, toward the development of a formal 
description of the profession of statistics (mentioned in Part I of the Annual 
Report of the President). This assistance was carried out jointly with Dr. 
Chester I. Bliss who was appointed by the American Statistical Association to
assist with this project. It is believed by this Committee that the description 
put forth by the Roster will help bring about recognition of standards of pro- 
fessional competence in statistics and in the teaching of statistics. 

HAROLD HOTELLING, Chairman
WALTER BARTKY
MILTON FRIEDMAN
W. EDWARDS DEMING


Appendix C


Report from the Committee on Finance 


The Committee on Finance met in the office of Dr. C. F. Roos in New York 
City on September 14, 1945. Present were Messrs. Roos, C. H. Fischer, P. S.
Dwyer; absent, A. C. Olshen. 

The Treasurer presented a summary of income and expenses during the third 
quarter of 1945 through September 13. This information was considered along 
with the first half year reports which were prepared some months ago. The 
Treasurer also presented a graph showing balance on hand at the end of each 
month (1939-1945) and one showing income during each month (1939-1945). 
These facts, as well as other pertinent information, were used in formulating the 
recommendations which follow. 

The Finance Committee proposes to the Board of Directors that the following 
recommendations be approved by the Board as policy for the Institute of Mathe- 
matical Statistics. 

1. That no revision be made with reference to the adoption of the expected 
budget for 1945. It appears now that the income will be somewhat higher than 
the amount indicated on the expected budget ($6450) and that the amount of 
expense should be somewhat lower than the amount there estimated ($6050).

2. That the Secretary-Treasurer be instructed to prepare an Annual statement 
for 1945 on the general plan of previous annual statements with the addition of 
an analysis of assets and liabilities. The main assets are cash, bonds, and back 
issues of the Annals. It is recommended that the back issues be valued at 75 
cents per copy (for inventory purpose)—a fair estimate of cost. It is further 
recommended that no value be placed on exchanges and office equipment. 

3. That the Secretary-Treasurer prepare the annual statement prior to the 
winter meeting, which means presumably that the books will be closed about 
December 10th. 

4. That, in consideration of the nature of the graph of the income of the 
Institute, the Institute adopt the policy of having its yearly report run from 
July 1 to July 1 and that the Secretary-Treasurer be instructed to draw up an 
additional annual report as of June 30, 1946. 

5. That the Secretary-Treasurer be instructed to draw up a budget for 1946 
and to submit it to the Finance Committee in sufficient time so that action may 
be taken on it by the Board at its winter meeting. 

6. That the U. S. Government G Bonds now owned by the Institute ($3000) 
be listed on the books at their face values even though the market values of these 
bonds are slightly lower. 

7. That the total amounts of all life membership payments be placed in a 
special life membership fund and that these funds, at least twice a year, be used 
in the purchase of U.S. Government F Bonds. The market value of these bonds 
shall be used in determining the amount of this fund at any accounting period. 

8. That the Secretary-Treasurer be authorized to take whatever steps are
necessary to obtain adequate interest on our liquid assets. That he maintain
sufficient cash position to carry on the business transactions of the Institute and 
that he invest the remainder (a) either in U. S. Government G bonds or (b) in 
short term bonds. 
9. That the purchase from Professor Carver of all back issues jointly owned by 
Professor Carver and the Institute be made an item of the budget for 1946. 
10. That the Secretary-Treasurer be instructed to purchase a $2,000 fidelity 
Bond Form B (a form which covers negligence as well as dishonesty) for 3 years 
for the office of Secretary-Treasurer. 
11. That a policy be adopted of allowing a straight 10% discount to all agencies 
and booksellers who send us subscriptions or orders for back issues. 
12. That the Institute set up a permanent Committee on Finance with the 
Secretary-Treasurer as ex-officio member and chairman. There shall be three 
additional members with terms of three years with a new member each year. 
At the formation of the Committee one member shall be appointed for one year, 
one for two years, and one for three years. A resignation from the Committee 
shall be followed by an appointment for the unexpired term. 
13. That the Board notify any committee working on revision of the Constitu- 
tion and By-Laws that it is supporting a permanent committee on Finance and 
believes it appropriate that a statement of the organization and duties of this 
committee should appear in the By-Laws. 
PAUL S. DWYER, Chairman
CARL H. FISCHER
ABRAHAM C. OLSHEN
CHARLES F. ROOS

September 15, 1945


Appendix D
Report from the Committee on Policy with Regard to Local Chapters 


Attached to this report is a summary of provisions for organizing and working 
with local chapters; it might be cast into appropriate form and incorporated into 
the Constitution of the Institute. From these recommended provisions it will 
be clear that this committee does not favor the organization of weak inactive 
chapters. Unless the membership of the Institute grows substantially it will 
be possible to have only a very limited number of local chapters under these 
provisions. 

It is the opinion of the Committee that it is desirable for members of the In- 
stitute to amalgamate with members of other statistical organizations in the same 
area to form local statistical societies. We believe this will build stronger local 
statistical organizations and will effect greater advances in the application and 
development of effective statistical methods. Such amalgamation in the formulation
of local societies can best be stimulated, and national leadership provided,
after the national statistical organizations have accomplished a federation or
amalgamation. We therefore urge the Institute to use its influence in stimulat- 
ing discussion and action concerning national federation or amalgamation. 
The following further comments are made in addition to or supplementing 
those provisions recommended for incorporation into the Constitution of the 
Institute: 
1. Do not accept or reject the petition from any group until a plan of organiza- 
tion is formulated. There should be clearance on the following questions: 
a. What are the reciprocal responsibilities of chapters and the parent 
organization? What type of chapter activity should the Institute 
seek to promote? What kind of things can chapters do that will 
advance the purposes for which the Institute exists? 

We have indicated in the recommended provisions that the Presi- 
dent of the Institute should personally undertake or designate someone 
to work with the chapters in answering these and similar questions. 

b. If local chapters are not active will they hinder the efforts of the parent
organization? We believe that the existence of an inactive organization
is a detriment to development of an active statistical group in a
community. Activity can be measured in various ways:

a. Meetings for research in mathematical statistics
b. Joint meetings with other professions
c. Bringing in new members to the parent organization
d. Annual election of officers

2. If members of a chapter must be members of the parent organization, the
Secretary-Treasurer of the Institute should notify the secretary of a local
chapter whenever a new member joins within his area.

3. It is recommended that if a local chapter desires it, bills for Institute dues
contain provision for collection of local dues.

4. The Institute should not allow any local group to use its name unless the
group contributes to the accomplishment of the aims of the Institute.

MORRIS H. HANSEN, Chairman
GERTRUDE COX
SAMUEL S. WILKS


Suggested Article on Local Chapters for addition to the Constitution 


1. Local chapters of the Institute of Mathematical Statistics may be organ- 
ized to promote the work of the Institute by a local organization of members 
who are resident within a given limited territory. 

2. The members of the local chapter shall be members of the Institute. 

3. A local chapter may be established upon acceptance by the Board of Direc- 
tors of a petition signed by at least twenty-five members of the Institute residing 
in the area the chapter is to serve. 

4. Local chapters shall elect their own officers, designate committees, assess 
dues, and make any rules for their government not inconsistent with the Constitution
of the Institute of Mathematical Statistics.

5. The affairs of local chapters shall be in general charge of the President of the
Institute or a representative assigned by him to be responsible for local chapters,
under the direction of the Board of Directors.

6. Any local chapter will be dissolved by: 

(a) failing for two successive years to maintain a paid membership of at 
least 25 members or to hold at least one meeting per year which shall include 
election of officers; or 

(b) vote of the Board of Directors of the Institute.

7. Each local chapter shall transmit a report to the Secretary-Treasurer of the
Institute within 30 days of the annual business meeting, reporting among other
things, on its officers, the number of members, and on the meetings held during
the year.


Appendix E 
Report from the Committee on Meetings 


A meeting was held at Rutgers University on Sunday Sept. 16, which was 
attended by 115 members of the Institute. Simultaneously a meeting was held 
by the American Mathematical Society. The first session, which commenced 
at 10 a.m. was a symposium on sequential analysis. The chairman was Professor 
W. Allen Wallis of Stanford University and Director of the Statistical Research 
Group at Columbia University. The speakers and their titles are listed below. 


1. Theory of sequential analysis. 
Professor A. Wald, Columbia University 

2. Construction of multiple sampling inspection plans for attributes from sequential prin- 
ciples. 
Dr. Milton Friedman, National Bureau of Economic Research and the Statistical 
Research Group 

3. Applications of sequential analysis to the ranking of two populations with respect to a
single parameter.
Mr. Meyer A. Girshick, Bureau of Agricultural Economics and the Statistical Research
Group


The afternoon session was a series of contributed papers, followed by a pre- 
liminary report from the Institute’s Committee on the Teaching of Statistics, 
which was delivered by Professor Harold Hotelling. Dr. W. Edwards Deming, 
President of the Institute, was chairman of this meeting. The list of contributed
papers follows hereunder. 


1. On the variance of a random set in n dimensions. 
Dr. Herbert E. Robbins, The Post Graduate School, Annapolis 

2. The non-central Wishart distribution and its application to problems in multivariate 
analysis. 
Dr. T. W. Anderson, Jr., Princeton University 

3. The effect on a distribution function of small changes in the population function. 
Professor Burton H. Camp, Wesleyan University 

4. On composite distributions.
Dr. Casper Goffman and Dr. Benjamin Epstein, Westinghouse Electric and Manufacturing
Company

5. Population, expected values, and sample.
Professor Emil J. Gumbel, New School for Social Research

6. On the selection of a sample in repeated steps.
Dr. William G. Madow, Bureau of the Census

7. On optimum estimates for stratified samples. (Presented by Margaret Gurney, Bureau
of the Census)
Mr. Morris H. Hansen and Mr. William N. Hurwitz, Bureau of the Census

8. Pearsonian correlation coefficients associated with least squares theory. (Presented by
title)
Professor Paul S. Dwyer, University of Michigan


At this writing preparations are being made for a meeting to be held in Cleve- 
land, January 24-27, 1946, and for a meeting with the A.A.A.S. to be held in 
St. Louis, March 27-30. 

JOHN H. CURTISS, Chairman
T. KOOPMANS
WILLIAM G. MADOW


Appendix F
Report from the Committee on Membership 


The Committee, after study and consideration, recommended to the Board of 
Directors that Messrs. M. S. Bartlett, T. Haavelmo, William N. Hurwitz, and
John von Neumann be advanced to the grade of Fellow. This recommendation 
was approved by the Board. 

The Committee, with the advice and approval of the Board is preparing a 
letter to be sent to groups of people who are not members of the Institute to call 
their attention to the work of the Institute. This letter will be accompanied by 
reprints of a recent paper by Wald and Wolfowitz on Sampling inspection-plans 
for continuous production, with a brief explanation of the field covered by the 
Wald-Wolfowitz paper, and the statement that it and others that have appeared 
in recent issues of the Annals have already modified statistical practice in im- 
portant ways. 

JOSEPH L. DOOB, Chairman
PAUL S. DWYER
T. KOOPMANS
WILL FELLER


Appendix G


Report from the Committee for Increasing Subscriptions to Libraries and 
Laboratories 


This committee prepared suitable literature to send to prospective subscribers. 
This literature contained a concise description of the nature of the Annals, a 
table of contents for a year, and a subscription blank.

Alphabetical lists of public, college, university and industrial libraries were
prepared. These lists contained the name, the librarian, and the address of each 
library. They were checked for duplicates for present subscribers and sent to 
Professor Dwyer, Secretary-Treasurer. Altogether, the list contained about 
1500 libraries. 

Professor Dwyer took care of printing the literature, further checking for 
duplicates, addressing the envelopes, and mailing. 

WILLIAM DOWELL BATEN, Chairman
HAROLD F. DODGE
IRVING W. BURR
L. AROIAN


ANNUAL REPORT OF THE SECRETARY-TREASURER OF THE 
INSTITUTE 


(For 1945) 


Accounts of the Rutgers meeting of the Institute appeared in the September 
issue of the Annals. Notices of meetings of the Washington Chapter have been 
sent out from the office of the Secretary-Treasurer. 

Due to a large extent to activity of the members, the Institute has enjoyed a 
large increase in membership during the year. The 606 members of a year ago
have increased to 777. This is an increase of over 28%. 
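
The percentage quoted here can be checked from the two membership counts. The following is only an illustrative sketch; the function name `percent_increase` is an assumption introduced for this check and does not come from the report.

```python
def percent_increase(old: int, new: int) -> float:
    """Return the percentage increase from old to new."""
    return (new - old) / old * 100.0


# Membership counts quoted in the report: 606 a year ago, 777 now.
growth = percent_increase(606, 777)
print(f"{growth:.1f}%")  # prints "28.2%"
```

The result, a little over 28 per cent, agrees with the statement in the report.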

The Secretary-Treasurer wishes to acknowledge the continued assistance of 
Professor Lloyd Knowler in looking after the back issues of the Annals which 
are stored at Iowa City. 

The following financial statement is drawn up along lines specified by the 
Finance Committee and the Board of Directors. It covers the period December 
31, 1944 to December 31, 1945. 


FINANCIAL STATEMENT
December 31, 1944, to December 31, 1945

A. RECEIPTS

BALANCE ON HAND, DECEMBER 31, 1944 ........................ $6,790.65
[entry illegible] .........................................  4,108.40
[entry illegible] .........................................    885.00
SUBSCRIPTIONS ............................................. [first digit illegible],915.73
[entry illegible] ......................................... [amount illegible]
INCOME FROM INVESTMENTS ................................... [amount illegible]
MISCELLANEOUS ............................................. [amount illegible]

B. EXPENDITURES

ANNALS—CURRENT
    Office of Editor ...................... [amount illegible]
    Waverly Press .........................  4,056.42
                                                            $4,456.42
ANNALS—BACK NUMBERS
    Purchase from H. C. Carver ............ [amount illegible]
    Reprinted 300 copies Vol. I No. 2, Vol. [remainder illegible]
    Iowa City Office ...................... [amount illegible]
                                                             1,053.01
OFFICE OF PRESIDENT .......................................    130.25
MATHEMATICAL REVIEWS ......................................    100.00
OFFICE OF THE SECRETARY-TREASURER
    Printing, Mimeographing, programs, etc. [amount illegible]
    [entry partly illegible] envelopes) ... [amount illegible]
    [entry illegible] .....................    166.25
    Clerical help ......................... [amount illegible]
                                                             1,773.78
[entry illegible] .........................................     50.76

BALANCE ON HAND, DECEMBER 31, 1945 (Cash and Bonds)

                                                           $15,112.44





C. SUMMARY OF RECEIPTS AND EXPENDITURES

BALANCE ON HAND,* DECEMBER 31, 1944 .......................  $6,790.65
RECEIPTS DURING 1945 ......................................   8,321.79
EXPENDITURES DURING 1945 ..................................   7,564.22
BALANCE ON HAND,* DECEMBER 31, 1945 .......................   7,548.22
NET EXCESS OF RECEIPTS OVER EXPENDITURES, 1945
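
The figures in this summary can be cross-checked against one another and against the group totals printed in the itemized expenditures. The sketch below is illustrative only. It takes 8,321.79 as the year's receipts and 7,564.22 as the year's expenditures, which is the reading under which the printed figures reconcile, and all variable names are assumptions introduced here. The net excess is derived from those two amounts rather than quoted, since no amount is legible on that line.

```python
from decimal import Decimal

# Amounts as printed in the summary of receipts and expenditures.
balance_1944 = Decimal("6790.65")
receipts_1945 = Decimal("8321.79")      # assumed to be the year's receipts
expenditures_1945 = Decimal("7564.22")  # assumed to be the year's expenditures
balance_1945 = Decimal("7548.22")

# Group totals printed in the itemized statement of expenditures.
expenditure_groups = [
    Decimal("4456.42"),  # Annals, current
    Decimal("1053.01"),  # Annals, back numbers
    Decimal("130.25"),   # Office of President
    Decimal("100.00"),   # Mathematical Reviews
    Decimal("1773.78"),  # Office of the Secretary-Treasurer
    Decimal("50.76"),    # remaining item (label not legible)
]

# Opening balance plus receipts is the total to be accounted for.
total_to_account_for = balance_1944 + receipts_1945

# Net excess of receipts over expenditures, derived rather than quoted.
net_excess = receipts_1945 - expenditures_1945

print(total_to_account_for)     # 15112.44
print(sum(expenditure_groups))  # 7564.22
print(net_excess)               # 757.57
```

Under these assumptions the total of $15,112.44 shown in the statement is reproduced, the expenditure groups add up to the expenditures shown in the summary, and the derived net excess equals the change in the balance on hand between the two dates.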


D. COMPARISON OF ASSETS ON DECEMBER 31, 1944 AND DECEMBER 31, 1945

                                                    1944           1945
U. S. Government G Bonds .....................  $3,000.00      $6,000.00
Life Membership Funds ........................     330.00 Bank    888.00 F Bonds
                                                                  327.00 Bank Dep.
Additional Bank Deposits .....................   3,460.65         333.22
Current Accounts Receivable ..................     303.73         255.35
Estimated Value (Cost)** of back issues of Annals
    [entry illegible] ........................   4,219.25       3,825.75
    [entry illegible] ........................     567.00       1,242.80
Deduct Estimated Value of issues owned by H.C.


E. LIABILITIES OF INSTITUTE OF MATHEMATICAL STATISTICS AS OF DECEMBER 31, 1945


All bills which have been presented have been paid and there are no outstanding ac- 
counts against the Institute of appreciable size. The $1215 in Life Membership payments 
require the Institute to provide the privileges of membership for life for the 17 members 
who have made payments. About $2500 should be credited to 1946 dues and subscriptions. 


PAUL S. DWYER
Secretary-Treasurer.

December 31, 1945


* In form of bank deposit and government bonds. 
** Value of Annals calculated at 75 cents per copy. All 1944 figures and 1945 Ann Arbor
figures based on physical inventory. 1945 Iowa City figures based on book inventory. 





ANNUAL REPORT OF THE EDITOR 
(For 1945) 


In spite of the war, enough papers in mathematical statistics have been
proposed for publication in the Annals in 1945 to keep the total volume of ma- 
terial at approximately 450 pages, the level which has been maintained during 
the last few years. A total of 40 papers were published in the 1945 volume of the 
Annals of which 14 were short notes published in the “notes” section. The 
outlook for a sufficient number of acceptable papers to maintain the usual volume
of publication during 1946 is quite favorable. Many mathematical statisticians
who were engaged in war work are now free to resume their research.
In some cases statistical theory developed in connection with classified war 
research projects can be expected to be declassified in the near future and made 
available for open publication. 

Most of the material which has been published in the Annals consists of original 
research or extensions of work already published in mathematical statistics as 
contrasted with material of an expository character. In view of the considerable 
number of newcomers into the Institute, as well as a general increase of interest 
in probability and statistics during recent years, it would be highly desirable to 
publish more expository or survey material. Invitations have been accepted by 
several individuals to prepare expository articles, but they have been so heavily 
burdened with extra work during the war that they have been unable to complete
their tasks. It is hoped that circumstances will now permit the preparation of 
expository articles. 

On behalf of the Editorial Committee for the Annals, the Editor takes this 
opportunity to acknowledge with thanks the refereeing assistance which has 
been received from the following individuals during 1945: R. L. Anderson, T. W. 
Anderson, George W. Brown, A. H. Copeland, W. J. Dixon, J. L. Doob, Milton 
Friedman, M. A. Girshick, M. Kac, T. Koopmans, Carl Kossack, D. H. Lehmer, 
H. B. Mann, P. J. McCarthy, F. C. Mosteller, H. E. Robbins, J. W. Tukey, 
W. A. Wallis, J. D. Williams, and C. P. Winsor. The Editor is also indebted to 
the following individuals at Princeton University for preparation of manu- 
scripts for the printer, and other editorial assistance from time to time in con- 
nection with the Annals: Mrs. Gladys B. Huling, Luis F. Nanni, Mrs. Euthie 
Ross, Mrs. Eleanor C. Schoenly, and John E. Walsh. 

S. S. WILKS
Editor 
December 31, 1945 





CONSTITUTION 
OF THE 
INSTITUTE OF MATHEMATICAL STATISTICS 


ARTICLE I 
NAME AND PURPOSE 


1. This organization shall be known as the Institute of Mathematical Statistics.
2. Its object shall be to promote the interests of mathematical statistics. 


ARTICLE II 
MEMBERSHIP 


1. The membership of the Institute shall consist of Members, Fellows, Honorary 
Members, and Sustaining Members. 

2. Voting members of the Institute shall be (a) the Fellows, and (b) all others, Junior 
members excepted, who have been members for twenty-three months prior to the date 
of voting. 

3. No person shall be a Junior Member of the Institute for more than a limited term as 
determined by the Committee on Membership and approved by the Board of Directors. 


ARTICLE III 


OFFICERS, BOARD OF DIRECTORS, AND COMMITTEE ON MEMBERSHIP 


1. The Officers of the Institute shall be a President, two Vice-Presidents, and a Secretary-Treasurer.
The terms of office of the President and Vice-Presidents shall be one
year and that of the Secretary-Treasurer three years. Elections shall be by majority 
ballots at Annual Meetings of the Institute. Voting may be in person or by mail. 

(a) Exception. The first group of Officers shall be elected by a majority vote of the 
individuals present at the organization meeting, and shall serve until December 31, 1936. 

2. The Board of Directors of the Institute shall consist of the Officers, the two previous 
Presidents, and the Editor of the Official Journal of the Institute. 

3. The Institute shall have a Committee on Membership composed of a Chairman and 
three Fellows. At their first meeting subsequent to the adoption of this Constitution, the 
Board of Directors shall elect three members as Fellows to serve as the Committee on 
Membership, one member of the Committee for a term of one year, another for a term of 
two years, and another for a term of three years. Thereafter the Board of Directors shall 
elect from among the Fellows one member annually at their first meeting after their 
election for a term of three years. The President shall designate one of the Vice-Presidents as 
Chairman of this Committee. 


ARTICLE IV 


MEETINGS 


1. A meeting for the presentation and discussion of papers, for the election of Officers, 
and for the transaction of other business of the Institute shall be held annually at such 
time as the Board of Directors may designate. Additional meetings may be called from 
time to time by the Board of Directors and shall be called at any time by the President 
upon written request from ten Fellows. Notice of the time and place of meeting shall be 
given to the membership by the Secretary-Treasurer at least thirty days prior to the 
date set for the meeting. All meetings except executive sessions shall be open to the 
public. Only papers accepted by a Program Committee appointed by the President may 
be presented to the Institute. 

2. The Board of Directors shall hold a meeting immediately after their election and 
again immediately before the expiration of their term. Other meetings of the Board 
may be held from time to time at the call of the President or any two members of the 
Board. Notice of each meeting of the Board, other than the two regular meetings, 
together with a statement of the business to be brought before the meeting, must be 
given to the members of the Board by the Secretary-Treasurer at least five days prior to 
the date set therefor. Should other business be passed upon, any member of the Board 
shall have the right to reopen the question at the next meeting. 

3. Meetings of the Committee on Membership may be held from time to time at the call 
of the Chairman or any member of the Committee provided notice of such call and the 
purpose of the meeting is given to the members of the Committee by the Secretary- 
Treasurer at least five days before the date set therefor. Should other business be passed 
upon, any member of the Committee shall have the right to reopen the question at the 
next meeting. Committee business may also be transacted by correspondence if that 
seems preferable. 

4. At a regularly convened meeting of the Board of Directors, four members shall 
constitute a quorum. At a regularly convened meeting of the Committee on Member- 
ship, two members shall constitute a quorum. 


ARTICLE V 


PUBLICATIONS 


1. The Annals of Mathematical Statistics shall be the Official Journal for the Institute. 
The Editor of the Annals of Mathematical Statistics shall be a Fellow appointed by the 
Board of Directors of the Institute. The term of office of the Editor may be terminated 
at the discretion of the Board of Directors. 

2. Other publications may be originated by the Board of Directors as occasion arises. 


ARTICLE VI 
EXPULSION OR SUSPENSION 


1. Except for non-payment of dues, no one shall be expelled or suspended except by 
action of the Board of Directors with not more than one negative vote. 


ARTICLE VII 


AMENDMENTS 


1. This constitution may be amended by an affirmative two-thirds vote at any regu- 
larly convened meeting of the Institute provided notice of such proposed amendment 
shall have been sent to each voting member by the Secretary-Treasurer at least thirty 
days before the date of the meeting at which the proposal is to be acted upon. Voting 
may be in person or by mail. 





BY-LAWS 


ARTICLE I 


DUTIES OF THE OFFICERS, THE EDITOR, BOARD OF DIRECTORS, AND 
COMMITTEE ON MEMBERSHIP 


1. The President, or in his absence, one of the Vice-Presidents, or in the absence of the 
President and both Vice-Presidents, a Fellow selected by vote of the Fellows present, 
shall preside at the meetings of the Institute and of the Board of Directors. At meetings 
of the Institute, the presiding officer shall vote only in the case of a tie, but at meetings 
of the Board of Directors he may vote in all cases. At least three months before the date 
of the annual meeting, the President shall appoint a Nominating Committee of three 
members. It shall be the duty of the Nominating Committee to make nominations for 
Officers to be elected at the annual meeting and the Secretary-Treasurer shall notify all 
voting members at least thirty days before the annual meeting. Additional nomina- 
tions may be submitted in writing, if signed by at least ten Fellows of the Institute, up to 
the time of the meeting. 

2. The Secretary-Treasurer shall keep a full and accurate record of the proceedings 
at the meetings of the Institute and of the Board of Directors, send out calls for said 
meetings and, with the approval of the President and the Board, carry on the corre- 
spondence of the Institute. Subject to the direction of the Board, he shall have charge 
of the archives and other tangible and intangible property of the Institute and once a year 
he shall publish in the Annals of Mathematical Statistics a classified list of all Members and 
Fellows of the Institute. He shall send out calls for annual dues and acknowledge receipt 
of same; pay all bills approved by the President for expenditures authorized by the Board 
or the Institute; keep a detailed account of all receipts and expenditures, prepare a finan- 
cial statement at the end of each year and present an abstract of the same at the annual 
meeting of the Institute after it has been audited by a Member or Fellow of the Institute 
appointed by the President as Auditor. The Auditor shall report to the President. 

3. Subject to the direction of the Board, the Editor shall be charged with the responsi- 
bility for all editorial matters concerning the editing of the Annals of Mathematical Sta- 
tistics. He shall, with the advice and consent of the Board, appoint an Editorial Commit- 
tee of not less than twelve members to co-operate with him; four for a period of five years, 
four for a period of three years, and the remaining members for a period of two years, ap- 
pointments to be made annually as needed. All appointments to the Editorial Com- 
mittee shall terminate with the appointment of a new Editor. The Editor shall serve as 
editorial adviser in the publication of all scientific monographs and pamphlets authorized 
by the Board. 

4. The Board of Directors shall have charge of the funds and of the affairs of the 
Institute, with the exception of those affairs specifically assigned to the President or to 
the Committee on Membership. The Board shall have authority to fill all vacancies 
ad interim, occurring among the Officers, Board of Directors, or in any of the Committees. 
The Board may appoint such other committees as may be required from time to time 
to carry on the affairs of the Institute. The power of election to the different grades of 
Membership, except the grades of Member and Junior Member, shall reside in the Board. 

5. The Committee on Membership shall prepare and make available through the 
Secretary-Treasurer an announcement indicating the qualifications requisite for the 
different grades of membership. The Committee shall review these qualifications 
periodically and shall make such changes in these qualifications and make such recommendations 
with reference to the number of grades of membership as it deems advisable. The power 
to elect worthy applicants to the grades of Member and Junior Member shall reside in the 
Committee, which may delegate this power to the Secretary-Treasurer, subject to such 
reservations as the Committee considers appropriate. The Committee shall make recom- 
mendations to the Board of Directors with reference to placing members in other grades 
of membership. The Committee shall give its attention to the question of increasing the 
number of applicants for membership and shall advise the Secretary-Treasurer on plans 
for that purpose. 


ARTICLE II 


DUES 


1. Members shall pay five dollars at the time of admission to membership and shall 
receive the full current volume of the Official Journal. Thereafter, Members shall pay 
five dollars annual dues. The annual dues of Junior Members shall be two dollars and 
fifty cents. 

The annual dues of Fellows shall be five dollars. The annual dues of Sustaining 
Members shall be fifty dollars. Honorary Members shall be exempt from all dues. 

(a) Exception. In the case that two Members of the Institute are husband and wife 
and they elect to receive between them only one copy of the Official Journal, the annual 
dues of each shall be three dollars and seventy-five cents. 

(b) Exception. Any Member or Fellow may make a single payment which will be 
accepted by the Institute in place of all succeeding yearly dues and which will not other- 
wise alter his status as a Member or Fellow. The amount of this payment will depend 
upon the age of this Member or Fellow and will be based upon a suitable table and rate of 
interest, to be specified by the Board of Directors. 

(c) Exception. Any Member or Junior Member of the Institute serving, except as a 
commissioned officer, in the Armed Forces of the United States or of one of its allies, may 
upon notification to the Secretary-Treasurer be excused from the payment of dues until the 
January first following his discharge from the Service. He shall have all privileges of 
membership except that he shall not receive the Official Journal. However during the 
first year of his resumed regular membership he may have the right to purchase, at $2.50 
per volume, one copy of each volume of the Official Journal published during the period 
of his service membership. 

2. Annual dues shall be payable on the first day of January of each year. 

3. The annual dues of a Fellow, Member, or Junior Member include a subscription to 
the Official Journal. The annual dues of a Sustaining Member include two subscriptions 
to the Official Journal. 

4. It shall be the duty of the Secretary-Treasurer to notify by mail anyone whose dues 
may be six months in arrears, and to accompany such notice by a copy of this Article. 
If such person fail to pay such dues within three months from the date of mailing such 
notice, the Secretary-Treasurer shall report the delinquent one to the Board of Directors, 
by whom the person’s name may be stricken from the rolls and all privileges of member- 
ship withdrawn. Such person may, however, be re-instated by the Board of Directors 
upon payment of the arrears of dues. 







ARTICLE III 
SALARIES 


1. The Institute shall not pay a salary to any Officer, Director, or member of any 
committee. 


ARTICLE IV 


AMENDMENTS 


1. These By-Laws may be amended in the same manner as the Constitution or by a 
majority vote at any regularly convened meeting of the Institute, if the proposed 
amendment has been previously approved by the Board of Directors. 